17.5.10 Classifying the Edges Using the Obtained Embeddings
You can use the obtained embeddings in downstream edge classification tasks.
The following code shows how to train a multi-layer perceptron (MLP) classifier
that takes the embeddings as input. It assumes that the edge label information
is stored under the edge property labels.
import pandas as pd
from sklearn.metrics import accuracy_score, make_scorer
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# prepare input data
edge_vectors_df = edge_vectors.to_pandas().astype({"edgeId": int})
edge_labels_df = pd.DataFrame([
    {"edgeId": e.id, "labels": properties}
    for e, properties in graph.get_edge_property("labels").get_values()
]).astype(int)
edge_vectors_with_labels_df = edge_vectors_df.merge(edge_labels_df, on="edgeId")
feature_columns = [c for c in edge_vectors_df.columns if c.startswith("embedding")]
x = edge_vectors_with_labels_df[feature_columns].to_numpy()
y = edge_vectors_with_labels_df["labels"].to_numpy()
scaler = StandardScaler()
x = scaler.fit_transform(x)

# define an MLP classifier
model = MLPClassifier(
    hidden_layer_sizes=(6,),
    learning_rate_init=0.05,
    max_iter=2000,
    random_state=42,
)

# define a metric and evaluate with cross-validation
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=42)
scorer = make_scorer(accuracy_score, greater_is_better=True)
scores = cross_val_score(model, x, y, scoring=scorer, cv=cv, n_jobs=-1)
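
The listing above computes the cross-validation scores but does not summarize them, and it fits the scaler on all rows before the folds are split. The following is a minimal sketch of one possible variation, not the documented method: it places the scaler and the classifier in a scikit-learn pipeline so that scaling is refit on each training fold, and then reports the mean and standard deviation of the accuracy. The data here is synthetic, generated with make_classification as a stand-in for the embeddings and labels; in the real workflow, x and y would come from edge_vectors_with_labels_df.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic features and binary labels standing in for the
# edge embeddings (x) and the edge labels (y) of the real graph.
x, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, n_classes=2, random_state=42
)

# Scaling inside the pipeline is refit on the training part of every fold,
# so no statistics from the held-out fold reach the classifier.
pipeline = make_pipeline(
    StandardScaler(),
    MLPClassifier(
        hidden_layer_sizes=(6,),
        learning_rate_init=0.05,
        max_iter=2000,
        random_state=42,
    ),
)

# 5 folds repeated 3 times yields 15 accuracy values.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=3, random_state=42)
scores = cross_val_score(pipeline, x, y, scoring="accuracy", cv=cv)

print(f"accuracy: {scores.mean():.3f} +/- {scores.std():.3f} ({len(scores)} folds)")
```

The string "accuracy" selects the same metric as make_scorer(accuracy_score) in the listing above. Whether per-fold scaling changes the result noticeably depends on the data; for embeddings with similar distributions across folds the difference is usually small.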