AI Music Detection

Getting Started

See our docs for more details, but a short summary is:

Set environment variables:

export VERMILLIO_SDK_CLIENT_ID=<your client id>
export VERMILLIO_SDK_CLIENT_SECRET=<your client secret>

Initialize a VermillioMusicAIDetect client.

from vermillio.sdk.music import VermillioMusicAIDetect
music = VermillioMusicAIDetect()
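
Before constructing the client, it can help to confirm that both variables are actually set in the current process. The following is a minimal sketch, assuming the SDK reads these variables from the environment at construction time (this page does not say what happens when they are missing). `missing_env_vars` is a hypothetical helper, not part of the SDK.

```python
import os
from collections.abc import Mapping

# The two variable names listed in the setup step above.
REQUIRED_VARS = ("VERMILLIO_SDK_CLIENT_ID", "VERMILLIO_SDK_CLIENT_SECRET")

def missing_env_vars(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` before `VermillioMusicAIDetect()` and stopping when the list is non-empty gives a clearer error than a failed request later on.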

Running AI Detection

Vermillio must be able to access any audio you want to run AI Detection on. The simplest way to provide that access is a publicly available URL. If your audio is not public and instead lives in bucket storage (S3, GCS, etc.), we suggest generating temporary signed URLs so that Vermillio can access the content.

If you have local files, you can use upload_and_load_results to skip the signed URL step entirely — see Streamlined Upload & Detect below.

Creating Sources

from vermillio.sdk.music import AIDetectExternalSource

def _s3_signed_url(bucket: str, path: str, expiration: int = 600) -> str:
    """
    Generate a signed url for the bucket/path on s3.
    Params:
        bucket (str): The name of the bucket
        path (str): The path of the object
        expiration (int): Expiration in whole seconds that the signed url will be valid for (boto3 documents ExpiresIn as an int), default: 10 minutes
    Returns:
        str: Signed url to access the object s3://{bucket}/{path}.
    """
    import boto3
    from botocore.config import Config
    client = boto3.client('s3', config=Config(signature_version='s3v4'))

    return client.generate_presigned_url(
        ClientMethod='get_object',
        Params={
            'Bucket': bucket,
            'Key': path
        },
        ExpiresIn=expiration
    )

def _gs_signed_url(bucket: str, path: str, expiration: float = 600) -> str:
    """
    Generate a signed url for the bucket/path on gs.
    Params:
        bucket (str): The name of the bucket
        path (str): The path of the object
        expiration (float): Expiration in seconds that the signed url will be valid for, default: 10 minutes
    Returns:
        str: Signed url to access the object gs://{bucket}/{path}.
    """
    from google.cloud import storage
    import datetime

    client = storage.Client()
    return client.bucket(bucket).blob(path).generate_signed_url(
        version="v4",
        expiration=datetime.timedelta(seconds=expiration),
        method="GET"
    )

# `files` is your own list of audio URLs, one per track.
sources: list[AIDetectExternalSource] = []
for file in files:
    # If the file is not publicly accessible, generate a signed URL first; see the S3/GCS examples above.
    # To associate your own unique ID with this file, pass it as source_id="your source id".
    sources.append(AIDetectExternalSource(path=file, source_id="your source id"))
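
When a file list mixes public URLs with `s3://` and `gs://` URIs, the choice of signer can be made per entry. The sketch below is one way to do that, using only the standard library. `to_accessible_url` and the `signers` mapping are hypothetical names introduced for illustration and are not part of the SDK.

```python
from typing import Callable
from urllib.parse import urlparse

# A signer takes (bucket, path) and returns a signed URL.
Signer = Callable[[str, str], str]

def to_accessible_url(uri: str, signers: dict[str, Signer]) -> str:
    """Return a URL for `uri`, signing it when its scheme has a registered signer.

    http(s) URLs are returned unchanged. URIs such as "s3://bucket/key" are
    split into bucket and key and passed to the signer registered for the scheme.
    Note: urlparse treats "?" and "#" as query/fragment markers, so object keys
    containing those characters would need separate handling.
    """
    parsed = urlparse(uri)
    if parsed.scheme in ("http", "https"):
        return uri
    signer = signers.get(parsed.scheme)
    if signer is None:
        raise ValueError(f"No signer registered for scheme {parsed.scheme!r}: {uri}")
    return signer(parsed.netloc, parsed.path.lstrip("/"))
```

With the helpers defined earlier in this section, the mapping could be `{"s3": _s3_signed_url, "gs": _gs_signed_url}`, and each source could then be built with `path=to_accessible_url(file, signers)`.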

Load Sources

loaded = music.load(sources)

Load Results

from vermillio.sdk.music import AIDetectPipelineResults
results: list[AIDetectPipelineResults] = []
for source in loaded:
    print(f"Loading results for source: {source.id} / {source.source_id}")
    results.append(music.results(source.id, wait=True))

Summarize Results

def _summary(result: AIDetectPipelineResults):
    if result.status != 'Succeeded' or not result.results:
        return f"Unsuccessful, status={result.status}"
    if not result.results.detections:
        return "Unknown, no determinations made."

    # Named `parts` to avoid shadowing the module-level `results` list.
    parts: list[str] = []
    for d in result.results.detections:
        res = f"[{d.query_segment.start:0.2f}-{d.query_segment.end:0.2f}] {d.label}"
        if d.label != d.source_id:
            res += f" ({d.source_id})"
        res += f" {d.confidence:0.2f}"
        parts.append(res)
    return ", ".join(parts)

for result in results:
    print(f"{result.source_path} ({result.id}) :: {_summary(result)}")

Streamlined Upload & Detect

If your audio files are local, upload_and_load_results handles uploading, pipeline loading, and waiting for results in a single call — no signed URLs needed.

from vermillio.sdk.music import VermillioMusicAIDetect

music = VermillioMusicAIDetect()

file_paths = ["/path/to/track1.mp3", "/path/to/track2.mp3"]
titles     = ["Track 1",             "Track 2"]

results = music.upload_and_load_results(file_paths, titles)

for title, result in zip(titles, results):
    print(f"{title}: {result.model_dump()}")

wait=True is the default, so the call blocks until every source has finished processing. Pass wait=False to get back the source entries immediately and poll manually with music.results(source.id).
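
Manual polling with wait=False could be structured as a small loop with a timeout. This is a generic sketch rather than SDK code: `poll_until` is a hypothetical helper, and the caller decides which values count as finished, because this page only shows the 'Succeeded' status and does not list the others.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fetch` until `is_done(value)` is true, or raise after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while True:
        value = fetch()
        if is_done(value):
            return value
        # Give up if the next sleep would take us past the deadline.
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"No finished result after {timeout} seconds")
        sleep(interval)
```

A call might look like `poll_until(lambda: music.results(source.id), lambda r: r.status == "Succeeded")`. With that predicate, a source that ends in any other status keeps being polled until the timeout, so a real predicate should also accept whatever failure statuses the service reports.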

Example outputs:

Track 1: {'id': 'yDBbiWNdCITuPFXH', 'status': 'Succeeded', 'created_at': 1778017121.904996, 'updated_at': 1778017127.6257627, 'source_id': 'C2o11VkJxS6fxDIP', 'source_path': None, 'results': {'detections': [{'query_segment': {'start': 0.0, 'end': 19.173877551020407}, 'label': 'real_music', 'confidence': 0.9283246040344239, 'source_id': 'real_music'}]}}
Track 2: {'id': '5uIZiaU7kHdxn5Az', 'status': 'Succeeded', 'created_at': 1778017121.905108, 'updated_at': 1778017130.1764488, 'source_id': 'idUUdd8wB1ckblhs', 'source_path': None, 'results': {'detections': [{'query_segment': {'start': 0.0, 'end': 43.41657596371882}, 'label': 'real_music', 'confidence': 0.5549722438339483, 'source_id': 'real_music'}]}}
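
The two outputs above differ mainly in confidence (about 0.93 versus 0.55), so it may be useful to flag low-confidence detections for review. The sketch below works on the dictionary shape shown in the example output. `low_confidence_detections` is a hypothetical helper, and the 0.6 threshold is an arbitrary value chosen for illustration, not a recommendation from Vermillio.

```python
def low_confidence_detections(result: dict, threshold: float = 0.6) -> list[dict]:
    """Return the detections in a model_dump() dict whose confidence is below `threshold`."""
    # "results" may be None or missing when processing did not succeed.
    detections = (result.get("results") or {}).get("detections") or []
    return [d for d in detections if d["confidence"] < threshold]

# Trimmed copy of the "Track 2" output above.
track_2 = {
    "status": "Succeeded",
    "results": {"detections": [{
        "query_segment": {"start": 0.0, "end": 43.41657596371882},
        "label": "real_music",
        "confidence": 0.5549722438339483,
        "source_id": "real_music",
    }]},
}

for d in low_confidence_detections(track_2):
    seg = d["query_segment"]
    print(f"[{seg['start']:0.2f}-{seg['end']:0.2f}] {d['label']} {d['confidence']:0.2f}")
# prints: [0.00-43.42] real_music 0.55
```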