Download this example as a Jupyter notebook or a Python script.


Writing Compliance Results to a DataFrame

Introduction

The BoM Analytics package presents compliance results in a hierarchical data structure. Alternatively, you can represent the data in a tabular data structure, where each row contains a reference to the parent row. This example shows how compliance data could be translated from one format to another, making use of a pandas.DataFrame object to store the tabulated data.

Perform a Compliance Query

The first step is to perform a compliance query on an assembly that will result in a deeply nested structure. The code here is presented without explanation. For more information, see the the Part Compliance Query example.

[1]:
from ansys.grantami.bomanalytics import Connection, indicators, queries

server_url = "http://my_grantami_server/mi_servicelayer"
cxn = Connection(server_url).with_credentials("user_name", "password").connect()
svhc = indicators.WatchListIndicator(
    name="SVHC",
    legislation_names=["REACH - The Candidate List"],
    default_threshold_percentage=0.1,
)
part_query = (
    queries.PartComplianceQuery()
    .with_record_history_ids([565060])
    .with_indicators([svhc])
)
part_result = cxn.run(part_query)

The part_result object contains the compliance result for every subitem. This is ideal for understanding compliance at a certain ‘level’ of the structure, For example, we can display the compliance for each item directly under the root part.

[2]:
for part in part_result.compliance_by_part_and_indicator[0].parts:
    print(
        f"Part ID: {part.record_history_identity}, "
        f"Compliance: {part.indicators['SVHC'].flag}"
    )
Part ID: 565020, Compliance: WatchListFlag.WatchListCompliant
Part ID: 565046, Compliance: WatchListFlag.WatchListCompliant
Part ID: 565010, Compliance: WatchListFlag.WatchListCompliant
Part ID: 564812, Compliance: WatchListFlag.WatchListCompliant
Part ID: 564929, Compliance: WatchListFlag.WatchListCompliant
Part ID: 565082, Compliance: WatchListFlag.WatchListCompliant
Part ID: 565040, Compliance: WatchListFlag.WatchListCompliant
Part ID: 564772, Compliance: WatchListFlag.WatchListCompliant
Part ID: 565037, Compliance: WatchListFlag.WatchListCompliant
Part ID: 565053, Compliance: WatchListFlag.WatchListCompliant
Part ID: 564831, Compliance: WatchListFlag.WatchListCompliant
Part ID: 564821, Compliance: WatchListFlag.WatchListHasSubstanceAboveThreshold

However, this structure makes it difficult to compare items at different levels. To do that, we want to flatten the data into a tabular structure.

Flatten the Hierarchical Data Structure

We will flatten the data into a list of dict objects, where each dict represents an item in the hierarchy, and each value in the dict represents a property of that item. This structure can then be used either directly or used to construct a DataFrame.

First, define a helper function to transform a ComplianceQueryResult object into a dict. In addition to storing properties that are intrinsic to the item (such as the ID, type, and SVHC result), we want to store structural information, such as the level of the item and the ID of its parent.

[3]:
def create_dict(item, item_type, level, parent_id):
    """Add a BoM item to a list"""
    item_id = item.record_history_identity
    indicator = item.indicators["SVHC"]
    row = {
        "Item": item_id,
        "Parent": parent_id,
        "Type": item_type,
        "SVHC": indicator,
        "Level": level,
    }
    return row

To help with the flattening process, we will also define a schema, which describes which child item types each item type can contain.

[4]:
schema = {
    "Part": ["Part", "Specification", "Material", "Substance"],
    "Specification": ["Specification", "Coating", "Material", "Substance"],
    "Material": ["Substance"],
    "Coating": ["Substance"],
    "Substance": [],
}

The function itself performs the flattening via a stack-based approach, where the children of the item currently being processed are iteratively added to the items_to_process stack. Since this stack is being both modified and iterated over, we must use a while loop and .pop() statement instead of a for loop.

The stack uses a special type of collection called a deque, which is similar to a list but is optimized for these sorts of stack-type use cases involving repeated calls to .pop() and .extend().

[5]:
from collections import deque


def flatten_bom(root_part):
    result = []  # List that will contain all dicts

    # The stack contains a deque of tuples: (item_object, item_type, level, parent_id)
    # First seed the stack with the root part
    items_to_process = deque([(root_part, "Part", 0, None)])

    while items_to_process:
        # Get the next item from the stack
        item_object, item_type, level, parent = items_to_process.pop()
        # Create the dict
        row = create_dict(item_object, item_type, level, parent)
        # Append it to the result list
        result.append(row)

        # Compute the properties for the child items
        item_id = item_object.record_history_identity
        child_items = schema[item_type]
        child_level = level + 1

        # Add the child items to the stack
        if "Part" in child_items:
            items_to_process.extend([(p, "Part", child_level, item_id)
                                     for p in item_object.parts])
        if "Specification" in child_items:
            items_to_process.extend([(s, "Specification", child_level, item_id)
                                     for s in item_object.specifications])
        if "Material" in child_items:
            items_to_process.extend([(m, "Material", child_level, item_id)
                                     for m in item_object.materials])
        if "Coating" in child_items:
            items_to_process.extend([(c, "Coating", child_level, item_id)
                                     for c in item_object.coatings])
        if "Substance" in child_items:
            items_to_process.extend([(s, "Substance", child_level, item_id)
                                     for s in item_object.substances])

    # When the stack is empty, the while loop exists. Return the result list.
    return result

Finally, call the function above against the results from the compliance query and use the list to create a DataFrame.

[6]:
import pandas as pd

data = flatten_bom(part_result.compliance_by_part_and_indicator[0])
df_full = pd.DataFrame(data)
print(f"{len(df_full)} rows")
df_full.head()
301 rows
[6]:
Item Parent Type SVHC Level
0 565060 None Part SVHC, WatchListHasSubstanceAboveThreshold 0
1 564821 565060 Part SVHC, WatchListHasSubstanceAboveThreshold 1
2 565088 564821 Part SVHC, WatchListHasSubstanceAboveThreshold 2
3 12476 565088 Material SVHC, WatchListCompliant 3
4 164846 565088 Specification SVHC, WatchListHasSubstanceAboveThreshold 3

Post-processing the DataFrame

Now that we have the data in a DataFrame, we can perform operations across all levels of the structure more easily. For example, we can delete all rows that are less than the ‘Above Threshold’ state, retaining only rows that are non-compliant. (Note that this reduces the number of rows significantly.)

[7]:
threshold = indicators.WatchListFlag.WatchListAboveThreshold
df_non_compliant = df_full.drop(df_full[df_full.SVHC < threshold].index)
print(f"{len(df_non_compliant)} rows")
df_non_compliant.head()
18 rows
[7]:
Item Parent Type SVHC Level
0 565060 None Part SVHC, WatchListHasSubstanceAboveThreshold 0
1 564821 565060 Part SVHC, WatchListHasSubstanceAboveThreshold 1
2 565088 564821 Part SVHC, WatchListHasSubstanceAboveThreshold 2
4 164846 565088 Specification SVHC, WatchListHasSubstanceAboveThreshold 3
5 83146 164846 Coating SVHC, WatchListHasSubstanceAboveThreshold 4