5/03/2007

'top50 & reg' 'Block-Prior' Average Show


Download: PDF/JPG


Data: vsn-normalization.xls 'Control' Data

Setting:
murange = [0 .1 .5 1 2 4 8 12 16 32 64 100];
10 seeds for vbssm model training

Procedures:
1. Sort 'top50 & reg' as blocks: pdf / xls
2. In the k-th experiment, incorporate blocks[1:k] as the prior, and take the average over the 10 seeds for the significance computation
3. Analyze the recovered true arcs, Total # = 10.
They are grouped into 2 categories, tdcA-arcs and tdcR-arcs, where:

tdcA-arcs:
(colors match the legend)
tdcA pd tdcB
tdcA pd tdcC
tdcA pd tdcD
tdcA pd tdcE
tdcA pp tdcA

tdcR-arcs: (colors match the legend)
tdcR pp tdcA
tdcR pp tdcB
tdcR pp tdcC
tdcR pp tdcE
tdcR pp tdcD


More details about this work
1. Sort the 'top50 & reg' as blocks.
That is, arcs with the same 'from-gene' are grouped into one block, regardless of the 'to-gene'.
E.g. the group containing all tdcA-arcs is named block-1, and the group containing all tdcR-arcs is named block-2.
block-idx   from   to
1           tdcA   tdcA
1           tdcA   tdcB
1           tdcA   tdcD
1           tdcA   tdcE
1           tdcA   tdcF
1           tdcA   tdcG
1           tdcA   tdcC

2           tdcR   tdcA
2           tdcR   tdcB
2           tdcR   tdcC
2           tdcR   tdcE
2           tdcR   tdcF
2           tdcR   tdcG
2           tdcR   tdcD

From block-3 onward, blocks are numbered in alphabetical order of their 'from-genes'.
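The block sorting described above can be sketched as follows. This is only a minimal illustration, not the script actually used: the function name sort_into_blocks, the (from, to) tuple representation, and the gene names geneA / geneB in the usage note are assumptions made for the example.

```python
def sort_into_blocks(arcs, first=("tdcA", "tdcR")):
    """Group (from_gene, to_gene) arcs into blocks keyed by from-gene.

    The genes in `first` become block-1 and block-2; every other
    from-gene is numbered afterwards in alphabetical order.
    The to-gene plays no role in the grouping.
    """
    from_genes = {src for src, _ in arcs}
    ordered = [g for g in first if g in from_genes]
    ordered += sorted(from_genes - set(first))

    blocks = {g: [] for g in ordered}  # dicts keep insertion order
    for src, dst in arcs:
        blocks[src].append(dst)

    # Return (block_idx, from_gene, [to_genes]) with 1-based block indices.
    return [(idx, g, tos) for idx, (g, tos) in enumerate(blocks.items(), start=1)]
```

For example, given arcs from tdcR, geneB, tdcA and geneA (the last two names being hypothetical), the blocks come out in the order tdcA, tdcR, geneA, geneB. Note that sorted() compares strings by code point, so mixed-case gene names may need a key function.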

2. Add prior for vbssm training.
There are 29 experiments, since the number of blocks I sorted in 'top50 & reg' is exactly 29. Each experiment also explores the full mu range.
1st test: add all tdcA-arcs (i.e. block-1) as the prior.
2nd test: keep the tdcA-arcs (block-1) as the prior, and also add the tdcR-arcs (block-2) as the prior.
....
29th test: all arcs (i.e. block(1:29)) are added as priors.
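The cumulative prior scheme above can be sketched like this. It is an illustration under assumptions, not the original training script: the names cumulative_priors and experiment_grid are hypothetical, each block is assumed to be a list of (from, to) arcs, and the actual vbssm training call is left out.

```python
# mu values from the Setting section above.
MURANGE = [0, 0.1, 0.5, 1, 2, 4, 8, 12, 16, 32, 64, 100]


def cumulative_priors(blocks):
    """Yield (k, prior_arcs), where prior_arcs is the union of blocks 1..k."""
    prior = []
    for k, block in enumerate(blocks, start=1):
        # Build a new list each time so priors yielded earlier are not mutated.
        prior = prior + list(block)
        yield k, prior


def experiment_grid(blocks, murange=MURANGE):
    """Yield (k, mu, prior_arcs) for every experiment and every mu value."""
    for k, prior in cumulative_priors(blocks):
        for mu in murange:
            yield k, mu, prior
```

With 29 blocks and the 12 mu values listed in the settings, this grid has 29 x 12 = 348 (experiment, mu) combinations, each of which is then trained with 10 seeds.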

3. Organize the data from the 29 experiments, each run over murange = [0 .1 .5 1 2 4 8 12 16 32 64 100]
Since the mu value plays an important role in the recovery evaluation, I made the figure to reflect this point.

There is no true network at hand, only 10 known arcs. I divided them into 2 groups:
tdcA-arcs:(from gene = tdcA, to gene = don't care)
tdcA pd tdcB
tdcA pd tdcC
tdcA pd tdcD
tdcA pd tdcE
tdcA pp tdcA

tdcR-arcs:
(from gene = tdcR, to gene = don't care)
tdcR pp tdcA
tdcR pp tdcB
tdcR pp tdcC
tdcR pp tdcE
tdcR pp tdcD

Two colors, blue and pink, show the recovery results for the 10 true arcs. Blue = tdcA-arcs, pink = tdcR-arcs.
The blue+pink stack gives the total number of true arcs identified by vbssm.
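The blue and pink bar heights can be computed along these lines. This is a minimal sketch, not the code behind the figure: the function name count_recovered is hypothetical, arcs are assumed to be (from, to) tuples, and the pd/pp labels from the arc lists are ignored for simplicity.

```python
# The 10 known true arcs listed above, split by from-gene.
TRUE_ARCS = {
    "tdcA": {("tdcA", t) for t in ("tdcA", "tdcB", "tdcC", "tdcD", "tdcE")},
    "tdcR": {("tdcR", t) for t in ("tdcA", "tdcB", "tdcC", "tdcD", "tdcE")},
}


def count_recovered(identified):
    """Count how many known true arcs of each group appear in `identified`.

    `identified` is an iterable of (from_gene, to_gene) arcs called
    significant at one mu value. Arcs outside the 10 known ones are ignored.
    """
    found = set(identified)
    return {group: len(arcs & found) for group, arcs in TRUE_ARCS.items()}
```

The "tdcA" count corresponds to the blue segment, the "tdcR" count to the pink segment, and their sum to the height of the stacked bar for that mu value.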

Based on my understanding, the figure reveals at least 2 things:
1. mu = 0.5 is the optimum mu value, at which the number of recovered arcs peaks.
2. Overall, pink arcs (tdcR-arcs) appear only at certain mu values, whereas blue arcs (tdcA-arcs) are not significantly affected by the mu value.