The CBA-RG algorithm searches for all the CARs in a dataset based on the Apriori algorithm [16], which relies on the downward closure property: if an itemset X is frequent, then every subset x of X is also frequent. Instead of CBA-RG, Coenen’s CBA program is implemented with the
Apriori-TFP algorithm [17] and [18], a variant of the Apriori algorithm that uses tree-structured data representations for higher performance. The operation of the latter part, CBA-CB, is described as follows in [6]. “Given two rules, ri and rj, ri ≻ rj (also called ri precedes rj or ri has a higher precedence than rj) if 1. the confidence of ri is greater than that of rj, or” [...] “Let R be the set of generated rules and D the training data”. CBA-CB is “to choose a set of high precedence rules in R to cover D”. A generated classifier is of the form,
class label, the first rule that satisfies the sample will classify it. If there is no rule that applies to the sample, it takes on the default class, default_class. Below is a simple example of a classifier.
Example:
(Gene_01, Inc), (Gene_02, Dec) → (RLW, Inc)
(Gene_01, Inc), (Gene_03, Inc) → (RLW, Inc)
(NULL) → (RLW, NI)
In this example, each line corresponds to a rule included in the classifier. The rule with the (NULL) antecedent is the default rule of this classifier. When a sample, (Gene_01, Inc), (Gene_03, Inc), with an unknown class label
(it is unknown whether RLW is Inc or NI), is classified, the classifier answers (RLW, Inc), as the second rule is the first one that satisfies the sample. In another case, where a sample, (Gene_01, Inc), (Gene_02, Inc), is classified, the classifier answers (RLW, NI), as no rule other than the default rule satisfies the sample, and thus the default rule is applied.

Prior to the CBA analysis, we preprocessed the gene expression data in the liver (4D) and the liver weight data (15D) of rats after repeated doses of 149 compounds from the TG-GATEs database. First, gene expression values were corrected and normalized with the MAS 5.0 algorithm [19] to reduce inter-array variance [20]. Liver weights were transformed into relative liver weights, the ratio of liver weight to body weight, so that large variations in body weight would not skew the interpretation of organ weight [15]. Second, values were averaged over the individual animals included in each group. Then, for each compound-treated group, a fold change was calculated as the ratio of the average value of the treatment group to the average value of its corresponding control group, to reduce inter-study variance [21].
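The relative liver weight and fold change calculations described above can be illustrated with a minimal Python sketch. The function names and all numeric values below are hypothetical and chosen only for illustration; they are not taken from the TG-GATEs data, and the MAS 5.0 normalization step is not reproduced here.

```python
from statistics import mean


def relative_liver_weight(liver_weight_g, body_weight_g):
    """Ratio of liver weight to body weight for one animal."""
    return liver_weight_g / body_weight_g


def fold_change(treated_values, control_values):
    """Group average of the treated animals divided by the group
    average of the corresponding control animals."""
    return mean(treated_values) / mean(control_values)


# Hypothetical (liver weight, body weight) pairs in grams, one per animal.
treated = [(12.0, 300.0), (13.0, 310.0), (12.5, 305.0)]
control = [(10.0, 300.0), (10.5, 310.0), (10.2, 305.0)]

# Step 1: transform each animal's liver weight into a relative liver weight.
treated_rlw = [relative_liver_weight(lw, bw) for lw, bw in treated]
control_rlw = [relative_liver_weight(lw, bw) for lw, bw in control]

# Steps 2 and 3: average within each group, then take the ratio.
rlw_fold_change = fold_change(treated_rlw, control_rlw)
print(f"RLW fold change: {rlw_fold_change:.3f}")
```

The same fold_change function would apply to a gene's expression values, with one list per treatment group and one per matching control group.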
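The first-match behavior of the example classifier given earlier in this section can also be sketched in Python. This is an illustrative sketch under our own assumptions, not Coenen's implementation: the representation of a rule as a (set of items, class) pair and the names RULES, DEFAULT_CLASS and classify are hypothetical. The rules are assumed to be already sorted by precedence.

```python
# Each rule is (antecedent, consequent); the antecedent is a set of
# (item, value) pairs and the rules are listed in order of precedence.
RULES = [
    (frozenset({("Gene_01", "Inc"), ("Gene_02", "Dec")}), ("RLW", "Inc")),
    (frozenset({("Gene_01", "Inc"), ("Gene_03", "Inc")}), ("RLW", "Inc")),
]

# The default rule, written with a (NULL) antecedent in the example.
DEFAULT_CLASS = ("RLW", "NI")


def classify(sample, rules=RULES, default_class=DEFAULT_CLASS):
    """Return the class of the first rule whose antecedent is contained
    in the sample, or the default class if no rule applies."""
    sample_items = set(sample)
    for antecedent, consequent in rules:
        if antecedent <= sample_items:  # every item of the rule is in the sample
            return consequent
    return default_class


# The two samples discussed in the text.
print(classify([("Gene_01", "Inc"), ("Gene_03", "Inc")]))  # ('RLW', 'Inc')
print(classify([("Gene_01", "Inc"), ("Gene_02", "Inc")]))  # ('RLW', 'NI')
```

Keeping the rules in an ordered list makes the precedence explicit: changing the order of the list changes which rule fires first for a sample that matches several rules.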