Data

Features

Although we only propose a small number of base HCS parameters in our paper, they can be combined in many ways to capture different structural information about the instance. The following table lists every computed feature used for our machine learning classification and regression results. It includes our proposed HCS parameters as well as other parameters that we believe to be important, such as mergeability. For each feature, the table gives its name as it appears in our code; basic features also have a description, while derived ratio features are named by the expression that defines them.

| Feature Name | Description |
| --- | --- |
| numVars | the number of distinct variables in the formula |
| numClauses | the number of distinct clauses in the formula |
| CVR | numClauses / numVars |
| dvMean | the average number of times a variable appears |
| dvVariance | the variance in the number of times a variable appears |
| numCommunities | total number of communities |
| numLeaves | total number of leaf-communities |
| avgLeafDepth | average leaf depth |
| depthMostLeaves | the depth with the most leaf-communities |
| rootInterVars | the number of inter-community variables at the root level |
| lvl2InterVars | the average number of inter-community variables at depth 2 |
| lvl3InterVars | the average number of inter-community variables at depth 3 |
| rootInterEdges | the number of inter-community edges at the root level |
| lvl2InterEdges | the average number of inter-community edges at depth 2 |
| lvl3InterEdges | the average number of inter-community edges at depth 3 |
| rootDegree | the number of communities at the root level |
| lvl2Degree | the average number of communities at depth 2 |
| lvl3Degree | the average number of communities at depth 3 |
| maxDegree | the maximum degree over all levels |
| rootModularity | the modularity of the graph at the root level |
| lvl2Modularity | the average modularity at depth 2 |
| lvl3Modularity | the average modularity at depth 3 |
| maxModularity | the maximum modularity over all levels |
| rootMergeability | the mergeability score between all variables |
| lvl2Mergeability | the average mergeability score of a community at depth 2 |
| lvl3Mergeability | the average mergeability score of a community at depth 3 |
| maxMergeability | the maximum mergeability over all levels |
| lvl2CommunitySize | the average number of variables in a community at depth 2 |
| lvl3CommunitySize | the average number of variables in a community at depth 3 |
| leafCommunitySize | the average number of variables in a leaf-community |
| numLeaves / numCommunities | |
| rootInterEdges / rootInterVars | |
| lvl2InterEdges / lvl2InterVars | |
| lvl3InterEdges / lvl3InterVars | |
| max(interEdges / interVars) | the maximum interEdges / interVars ratio over all levels |
| rootInterEdges / rootCommunitySize | |
| lvl2InterEdges / lvl2CommunitySize | |
| lvl3InterEdges / lvl3CommunitySize | |
| max(interEdges / communitySize) | the maximum interEdges / communitySize ratio over all levels |
| rootInterVars / rootCommunitySize | |
| lvl2InterVars / lvl2CommunitySize | |
| lvl3InterVars / lvl3CommunitySize | |
| max(interVars / communitySize) | the maximum interVars / communitySize ratio over all levels |
| rootInterEdges / rootDegree | |
| lvl2InterEdges / lvl2Degree | |
| lvl3InterEdges / lvl3Degree | |
| rootInterVars / rootDegree | |
| lvl2InterVars / lvl2Degree | |
| lvl3InterVars / lvl3Degree | |
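
To illustrate how the simplest entries in the list relate to a formula, the following is a minimal sketch and not our actual feature-extraction code. It assumes the formula is given as a list of clauses, each a list of signed integers (as in DIMACS), that a variable is counted once per distinct clause it appears in, and that dvVariance is the population variance. The helper names `basic_features` and `ratio` are hypothetical, as is the choice to return 0.0 when a ratio's denominator is zero.

```python
from statistics import mean, pvariance


def basic_features(clauses):
    """Compute numVars, numClauses, CVR, dvMean and dvVariance for a CNF.

    `clauses` is a list of clauses, each a list of non-zero signed integers.
    """
    # Distinct clauses: literal order and repeated clauses are ignored.
    distinct = {frozenset(clause) for clause in clauses}

    # Assumption: a variable is counted once per distinct clause it occurs in.
    counts = {}
    for clause in distinct:
        for var in {abs(lit) for lit in clause}:
            counts[var] = counts.get(var, 0) + 1

    occurrences = list(counts.values())
    num_vars = len(counts)
    num_clauses = len(distinct)
    return {
        "numVars": num_vars,
        "numClauses": num_clauses,
        "CVR": num_clauses / num_vars,
        "dvMean": mean(occurrences),
        # Assumption: population variance rather than sample variance.
        "dvVariance": pvariance(occurrences),
    }


def ratio(features, numerator, denominator):
    """Derived ratio feature, e.g. rootInterEdges / rootInterVars.

    Assumption: a zero denominator yields 0.0 instead of raising.
    """
    d = features[denominator]
    return features[numerator] / d if d else 0.0


if __name__ == "__main__":
    cnf = [[1, -2], [2, 3], [-1, 3], [1, -2]]  # last clause is a duplicate
    print(basic_features(cnf))
```

The derived ratio features in the list can then be formed from any dictionary holding the basic values, for example `ratio(f, "rootInterEdges", "rootInterVars")`.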

Feature Clusters

In the following table, we list each representative feature alongside its parent cluster. These representatives are the features used for predicting solving time and for classifying an instance into its category.

| Feature | Cluster |
| --- | --- |
| rootMergeability | maxMergeability |
| maxInterEdges / CommunitySize | maxInterEdges / InterVars |
| rootInterEdges | |
| lvl2Mergeability | |
| CVR | dvVariance |
| leafCommunitySize | |
| lvl3Modularity | lvl2Degree, lvl3Degree |
| lvl2InterEdges / lvl2InterVars | lvl2InterEdges / lvl2CommunitySize, lvl3InterEdges / lvl3InterVars, lvl3InterEdges / lvl3CommunitySize |
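
As a sketch of how the table above might be used to reduce a full feature vector to its representatives, the mapping below transcribes the table into a dictionary. It assumes the Cluster column lists the other features grouped with each representative, and it normalizes feature names to the spelling used in the feature list. The names `CLUSTERS`, `representative_of` and `select_representatives` are hypothetical and do not come from our code.

```python
# Representative feature -> other members of its cluster (transcribed from the table).
CLUSTERS = {
    "rootMergeability": ["maxMergeability"],
    "maxInterEdges / CommunitySize": ["maxInterEdges / InterVars"],
    "rootInterEdges": [],
    "lvl2Mergeability": [],
    "CVR": ["dvVariance"],
    "leafCommunitySize": [],
    "lvl3Modularity": ["lvl2Degree", "lvl3Degree"],
    "lvl2InterEdges / lvl2InterVars": [
        "lvl2InterEdges / lvl2CommunitySize",
        "lvl3InterEdges / lvl3InterVars",
        "lvl3InterEdges / lvl3CommunitySize",
    ],
}


def representative_of(feature):
    """Return the representative whose cluster contains `feature`, or None."""
    for rep, members in CLUSTERS.items():
        if feature == rep or feature in members:
            return rep
    return None


def select_representatives(feature_vector):
    """Keep only the representative features present in `feature_vector`."""
    return {name: feature_vector[name] for name in CLUSTERS if name in feature_vector}


if __name__ == "__main__":
    print(representative_of("lvl3Degree"))
    print(select_representatives({"CVR": 4.2, "dvVariance": 1.0, "rootInterEdges": 7}))
```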