Data Models
This section describes the structure of the ScanCode.io data models and provides details about all fields included in the output files.
Project
- class scanpipe.models.Project
The Project encapsulates all analysis processing. Multiple analysis pipelines can be run on the same project.
- Parameters:
uuid (UUIDField) – Primary key: UUID
extra_data (JSONField) – Extra data. Optional mapping of extra data key/values.
created_date (DateTimeField) – Created date. Creation date for this project.
name (CharField) – Name. Name for this project.
slug (SlugField) – Slug
work_directory (CharField) – Work directory. Project work directory location.
is_archived (BooleanField) – Is archived. Archived projects cannot be modified anymore and are not displayed by default in project lists. Multiple levels of data cleanup may have happened during the archive operation.
notes (TextField) – Notes
settings (JSONField) – Settings
Relationship fields:
- Parameters:
labels (TaggableManager to Tag) – Tags. A comma-separated list of tags. (related name: project)
tagged_items (GenericRelation to UUIDTaggedItem) – Tagged items (related name: +)
Reverse relationships:
- Parameters:
projectmessages (Reverse ForeignKey from ProjectMessage) – All projectmessages of this project (related name of project)
inputsources (Reverse ForeignKey from InputSource) – All inputsources of this project (related name of project)
runs (Reverse ForeignKey from Run) – All runs of this project (related name of project)
codebaseresources (Reverse ForeignKey from CodebaseResource) – All codebaseresources of this project (related name of project)
codebaserelations (Reverse ForeignKey from CodebaseRelation) – All codebaserelations of this project (related name of project)
discoveredpackages (Reverse ForeignKey from DiscoveredPackage) – All discoveredpackages of this project (related name of project)
discovereddependencies (Reverse ForeignKey from DiscoveredDependency) – All discovereddependencies of this project (related name of project)
webhooksubscriptions (Reverse ForeignKey from WebhookSubscription) – All webhooksubscriptions of this project (related name of project)
webhookdeliveries (Reverse ForeignKey from WebhookDelivery) – All webhookdeliveries of this project (related name of project)
- add_downloads(downloads)
Move the given downloads to the current project’s input/ directory and add the input_source for each entry.
- add_error(description='', model='', details=None, exception=None, object_instance=None)
Create an ERROR ProjectMessage record for this project.
- add_info(description='', model='', details=None, exception=None, object_instance=None)
Create an INFO ProjectMessage record for this project.
- add_input_source(download_url='', filename='', is_uploaded=False, tag='')
Create an InputFile entry for the current project, given a download_url or a filename.
- add_message(severity, description='', model='', details=None, exception=None, object_instance=None)
Create a ProjectMessage record for this Project.
The model attribute can be provided as a string or as a Model class. A resource can be provided to keep track of the codebase resource that was analyzed when the error occurred.
- add_pipeline(pipeline_name, execute_now=False, selected_groups=None)
Create a new Run instance with the provided pipeline on the current project.
If execute_now is True, the pipeline task is created. on_commit() is used to postpone the task creation until after the transaction is successfully committed. If there is no active transaction, the callback is executed immediately.
- add_upload(uploaded_file, tag='')
Write the given upload to the current project’s input/ directory and add the input_source.
- add_uploads(uploads)
Write the given uploads to the current project’s input/ directory and add the input_source for each entry.
- add_warning(description='', model='', details=None, exception=None, object_instance=None)
Create a WARNING ProjectMessage record for this project.
- add_webhook_subscription(**kwargs)
Create a new WebhookSubscription instance with the provided target_url for the current project.
- archive(remove_input=False, remove_codebase=False, remove_output=False)
Set the project is_archived field to True.
The remove_input, remove_codebase, and remove_output options can be provided during the archive operation to delete the related work directories.
The project cannot be archived if one of its related runs is queued or already running.
- clear_tmp_directory()
Delete the whole content of the tmp/ directory. This is called at the end of each pipeline Run, so the tmp/ directory must not be used to store content needed for further processing in a following pipeline Run.
- clone(clone_name, copy_inputs=False, copy_pipelines=False, copy_settings=False, copy_subscriptions=False, execute_now=False)
Clone this project using the provided clone_name as the new project name.
- copy_input_from(input_location)
Copy the file at input_location to the current project’s input/ directory.
- delete(*args, **kwargs)
Delete the work_directory along with the project-related data in the database.
Delete all related object instances using the private _raw_delete model API. This bypasses the objects collection, cascade deletions, and signals. It results in much faster object deletion, but it needs to be applied in the correct model order as the cascading event will not be triggered. Note that this approach is used in Django’s fast_deletes, but the scanpipe models cannot be fast-deleted as they have cascades and relations.
- get_codebase_config_directory()
Return the .scancode config directory if available in the codebase directory.
- get_enabled_settings()
Return the enabled settings with non-empty values.
- get_env(field_name=None)
Return the project environment loaded from the scancode-config.yml config file, when available, and overridden by the settings model field. field_name can be provided to get a single entry from the env.
- get_env_from_config_file()
Return the env dict loaded from the scancode-config.yml config file.
- get_ignored_dependency_scopes_index()
Return a dictionary index of the ignored_dependency_scopes setting values defined in this Project env.
- get_ignored_vulnerabilities_set()
Return a set of ignored_vulnerabilities setting values defined in this Project env.
- get_input_config_file()
Return the scancode-config.yml file from the input/ directory or from the codebase/ immediate subdirectories.
Priority order:
1. If a config file exists directly in the input/ directory, return it.
2. If exactly one config file exists in a codebase/ immediate subdirectory, return it.
3. If multiple config files are found in subdirectories, report an error.
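The priority order above can be illustrated with a minimal standalone sketch. It does not use the scanpipe code: the function name find_config_file is hypothetical, and raising ValueError stands in for however the real method reports the multiple-files error.

```python
from pathlib import Path

CONFIG_FILENAME = "scancode-config.yml"


def find_config_file(work_path):
    """Return the config file Path following the priority order, or None."""
    work_path = Path(work_path)

    # 1. A config file directly in input/ takes priority.
    input_config = work_path / "input" / CONFIG_FILENAME
    if input_config.exists():
        return input_config

    # 2. Otherwise, look only in the immediate subdirectories of codebase/.
    codebase = work_path / "codebase"
    candidates = sorted(codebase.glob(f"*/{CONFIG_FILENAME}")) if codebase.is_dir() else []
    if len(candidates) == 1:
        return candidates[0]

    # 3. Several candidates are ambiguous. Raising is an assumption of this
    #    sketch; the documented behavior is only "report an error".
    if len(candidates) > 1:
        raise ValueError(f"Multiple {CONFIG_FILENAME} files found")
    return None
```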
- get_inputs_with_source()
Return an input list including the filename, download_url, and size data.
- get_latest_output(filename)
Return the latest output file with the “filename” prefix, for example “scancode-<timestamp>.json”.
- get_next_run()
Return the next non-executed Run instance assigned to current project.
- get_output_file_path(name, extension)
Return a crafted file path in the project output/ directory using given name and extension. The current date and time strings are added to the filename.
This method ensures the proper setup of the work_directory in case of a manual wipe and re-creates the missing pieces of the directory structure.
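As a rough illustration of how timestamped output names and get_latest_output(filename) fit together, here is a standalone sketch. The helper names and the exact timestamp format are assumptions made for the example, not the scanpipe implementation.

```python
from datetime import datetime


def craft_output_name(name, extension, now=None):
    """Return "<name>-<timestamp>.<extension>" (the timestamp format is assumed)."""
    now = now or datetime.now()
    return f"{name}-{now.strftime('%Y-%m-%d-%H-%M-%S')}.{extension}"


def latest_output(filenames, prefix):
    """Return the most recent filename starting with "<prefix>-", or None."""
    matching = sorted(n for n in filenames if n.startswith(f"{prefix}-"))
    # Zero-padded timestamps sort chronologically as plain strings.
    return matching[-1] if matching else None


older = craft_output_name("scancode", "json", datetime(2024, 1, 2, 3, 4, 5))
newer = craft_output_name("scancode", "json", datetime(2024, 1, 2, 15, 0, 0))
print(latest_output([newer, older, "summary-2024-01-03-00-00-00.json"], "scancode"))
```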
- get_output_files_info()
Return files from the output work directory including the name and size.
- get_resource(path)
Return the codebase resource present for a given path, or None if the resource with that path does not exist. This path is relative to the scan location. This is the same as the Codebase.get_resource() function.
- static get_root_content(directory)
Return a list of all files and directories of a given directory. Only the first level children will be listed.
- get_settings_as_yml()
Return the settings file content as yml, suitable for a config file.
- inputs(pattern='**/*', extensions=None)
Return the paths of all files and directories of the input/ directory matching a given pattern. The default **/* pattern means “this directory and all subdirectories, recursively”. Use the * pattern to only list the root content. The returned paths can be limited to the provided list of extensions.
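The difference between the **/* and * patterns can be seen with plain pathlib globbing. The sketch below is standalone: list_inputs is a hypothetical helper, and filtering extensions by filename suffix is an assumption about how such a filter could work.

```python
import tempfile
from pathlib import Path


def list_inputs(input_path, pattern="**/*", extensions=None):
    """Return sorted paths under input_path matching the glob pattern."""
    paths = sorted(Path(input_path).glob(pattern))
    if extensions:
        # str.endswith accepts a tuple of suffixes.
        paths = [p for p in paths if p.name.endswith(tuple(extensions))]
    return paths


def relative_names(root, paths):
    """Return POSIX-style paths relative to root, for display."""
    return [p.relative_to(root).as_posix() for p in paths]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "sub").mkdir()
    (root / "a.txt").write_text("a")
    (root / "sub" / "b.json").write_text("{}")

    everything = relative_names(root, list_inputs(root))        # recursive
    root_only = relative_names(root, list_inputs(root, "*"))    # first level only
    json_only = relative_names(root, list_inputs(root, extensions=[".json"]))
```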
- move_input_from(input_location)
Move the file at input_location to the current project’s input/ directory.
- reset(keep_input=True)
Reset the project by deleting all related database objects and all work directories, except the input directory when the keep_input option is True.
- save(*args, **kwargs)
Save this project instance. The workspace directories are set up during project creation.
- setup_work_directory()
Create the whole work_directory structure, skipping the parts that already exist.
- start_pipelines()
Start the next “not started” pipeline execution.
- walk_codebase_path()
Return the paths of the files and directories of the codebase/ directory, recursively.
- write_input_file(file_object)
Write the provided file_object to the project’s input/ directory.
- WORK_DIRECTORIES = ['input', 'output', 'codebase', 'tmp']
- can_change_inputs
Return True until one pipeline run has started its execution on the project. Always return False when the project is archived.
- can_start_pipelines
Return True if at least one “not started” pipeline is assigned to this project and no pipeline run is currently “queued” or “running”. Always return False when the project is archived.
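The condition described for can_start_pipelines can be expressed as a small standalone sketch. The ProjectState class and the status strings are hypothetical stand-ins for the Project and Run models.

```python
from dataclasses import dataclass, field

# Hypothetical status labels, used only in this sketch.
NOT_STARTED, QUEUED, RUNNING, DONE = "not_started", "queued", "running", "done"


@dataclass
class ProjectState:
    is_archived: bool = False
    run_statuses: list = field(default_factory=list)

    @property
    def can_start_pipelines(self):
        # Archived projects can never start pipelines.
        if self.is_archived:
            return False
        has_pending = NOT_STARTED in self.run_statuses
        is_busy = any(s in (QUEUED, RUNNING) for s in self.run_statuses)
        return has_pending and not is_busy
```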
- property codebase_path
Return the codebase directory as a Path instance.
- codebaserelations
Type: Reverse ForeignKey from CodebaseRelation
All codebaserelations of this project (related name of project)
- codebaseresources
Type: Reverse ForeignKey from CodebaseResource
All codebaseresources of this project (related name of project)
- created_date
Type: DateTimeField
Created date. Creation date for this project.
- dependency_count
Return the number of dependencies related to this project.
- discovereddependencies
Type: Reverse ForeignKey from DiscoveredDependency
All discovereddependencies of this project (related name of project)
- discoveredpackages
Type: Reverse ForeignKey from DiscoveredPackage
All discoveredpackages of this project (related name of project)
- file_count
Return the number of file resources related to this project.
- file_in_package_count
Return the number of file resources in a package related to this project.
- file_not_in_package_count
Return the number of file resources not in a package related to this project.
- has_single_resource
Return True if we only have a single CodebaseResource associated to this project, False otherwise.
- ignored_dependency_scopes_index
Return the computed value of get_ignored_dependency_scopes_index. The value is only generated once and cached for further calls.
- ignored_vulnerabilities_set
Return the computed value of get_ignored_vulnerabilities_set. The value is only generated once and cached for further calls.
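The “generated once and cached” behavior of these two attributes resembles what functools.cached_property provides. The following standalone sketch shows the pattern; the EnvHolder class and the sample identifiers are made up for the example, and whether scanpipe uses this exact decorator is not stated here.

```python
from functools import cached_property


class EnvHolder:
    """Hypothetical stand-in for an object exposing a project env mapping."""

    def __init__(self, env):
        self.env = env
        self.compute_calls = 0  # counts how often the value is computed

    def get_ignored_vulnerabilities_set(self):
        self.compute_calls += 1
        return set(self.env.get("ignored_vulnerabilities") or [])

    @cached_property
    def ignored_vulnerabilities_set(self):
        # Computed on first access, then stored on the instance.
        return self.get_ignored_vulnerabilities_set()


holder = EnvHolder({"ignored_vulnerabilities": ["VULN-1", "VULN-1", "VULN-2"]})
first = holder.ignored_vulnerabilities_set
second = holder.ignored_vulnerabilities_set
```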
- property input_files
Return list of files’ relative paths in the input/ directory recursively.
- property input_path
Return the input directory as a Path instance.
- property input_root
Return a list of all files and directories of the input/ directory. Only the first level children will be listed.
- property input_sources
- inputsources
Type: Reverse ForeignKey from InputSource
All inputsources of this project (related name of project)
- is_archived
Type: BooleanField
Is archived. Archived projects cannot be modified anymore and are not displayed by default in project lists. Multiple levels of data cleanup may have happened during the archive operation.
- labels = <taggit.managers._TaggableManager object>
- message_count
Return the number of messages related to this project.
- property output_path
Return the output directory as a Path instance.
- property output_root
Return a list of all files and directories of the output/ directory. Only first level children will be listed.
- package_count
Return the number of packages related to this project.
- projectmessages
Type: Reverse ForeignKey from ProjectMessage
All projectmessages of this project (related name of project)
- relation_count
Return the number of relations related to this project.
- resource_count
Return the number of resources related to this project.
- runs
Type: Reverse ForeignKey from Run
All runs of this project (related name of project)
- tagged_items
Type: Reverse GenericRelation from Project
All + of this Label (related name of tagged_items)
- property tmp_path
Return the tmp directory as a Path instance.
- vulnerable_dependency_count
Return the number of vulnerable dependencies related to this project.
- vulnerable_package_count
Return the number of vulnerable packages related to this project.
- webhookdeliveries
Type: Reverse ForeignKey from WebhookDelivery
All webhookdeliveries of this project (related name of project)
- webhooksubscriptions
Type: Reverse ForeignKey from WebhookSubscription
All webhooksubscriptions of this project (related name of project)
- property work_path
Return the work_directory as a Path instance.
CodebaseResource
- class scanpipe.models.CodebaseResource
A project’s Codebase Resources are records of its code files and directories. Each record is identified by its path under the project workspace.
These model fields should be kept in line with commoncode.resource.Resource.
- Parameters:
id (AutoField) – Primary key: ID
md5 (CharField) – MD5. MD5 checksum hex-encoded, as in md5sum.
sha1 (CharField) – SHA1. SHA1 checksum hex-encoded, as in sha1sum.
sha256 (CharField) – SHA256. SHA256 checksum hex-encoded, as in sha256sum.
sha512 (CharField) – SHA512. SHA512 checksum hex-encoded, as in sha512sum.
extra_data (JSONField) – Extra data. Optional mapping of extra data key/values.
detected_license_expression (TextField) – Detected license expression. The license expression summarizing the license info for this resource, combined from all the license detections
detected_license_expression_spdx (TextField) – Detected license expression spdx. The detected license expression for this file, with SPDX license keys
license_detections (JSONField) – License detections. List of license detection details.
license_clues (JSONField) – License clues. List of license matches that are not proper detections and potentially just clues to licenses or likely false positives. Those are not included in computing the detected license expression for the resource.
percentage_of_license_text (FloatField) – Percentage of license text. Percentage of file words detected as license text or notice.
copyrights (JSONField) – Copyrights. List of detected copyright statements (and related detection details).
holders (JSONField) – Holders. List of detected copyright holders (and related detection details).
authors (JSONField) – Authors. List of detected authors (and related detection details).
emails (JSONField) – Emails. List of detected emails (and related detection details).
urls (JSONField) – Urls. List of detected URLs (and related detection details).
compliance_alert (CharField) – Compliance alert. Indicates how the license expression complies with provided policies.
is_legal (BooleanField) – Is legal. True if this file is likely a legal, license-related file such as a COPYING or LICENSE file.
is_manifest (BooleanField) – Is manifest. True if this file is likely a package manifest file such as a Maven pom.xml or an npm package.json
is_readme (BooleanField) – Is readme. True if this file is likely a README file.
is_top_level (BooleanField) – Is top level. True if this file is a top-level file located either at the root of a package or in a well-known common location.
is_key_file (BooleanField) – Is key file. True if this file is a top-level file and either a legal, readme or manifest file.
path (CharField) – Path. The full path value of a resource (file or directory) in the archive it is from.
rootfs_path (CharField) – Rootfs path. Path relative to some root filesystem root directory. Useful when working on disk images, docker images, and VM images. E.g.: “/usr/bin/bash” for a path of “tarball-extract/rootfs/usr/bin/bash”
status (CharField) – Status. Analysis status for this resource.
size (BigIntegerField) – Size. Size in bytes.
tag (CharField) – Tag
type (CharField) – Type. Type of this resource as one of: file, directory, symlink
name (CharField) – Name. File or directory name of this resource with its extension.
extension (CharField) – Extension. File extension for this resource (directories do not have an extension).
programming_language (CharField) – Programming language. Programming language of this resource if this is a code file.
mime_type (CharField) – Mime type. MIME type (aka. media type) for this resource. See https://en.wikipedia.org/wiki/Media_type
file_type (CharField) – File type. Descriptive file type for this resource.
is_binary (BooleanField) – Is binary
is_text (BooleanField) – Is text
is_archive (BooleanField) – Is archive
is_media (BooleanField) – Is media
package_data (JSONField) – Package data. List of Package data detected from this CodebaseResource
Relationship fields:
- Parameters:
project (ForeignKey to Project) – Project (related name: codebaseresources)
Reverse relationships:
- Parameters:
related_to (Reverse ForeignKey from CodebaseRelation) – All related to of this codebase resource (related name of from_resource)
related_from (Reverse ForeignKey from CodebaseRelation) – All related from of this codebase resource (related name of to_resource)
discovered_packages (Reverse ManyToManyField from DiscoveredPackage) – All discovered packages of this codebase resource (related name of codebase_resources)
declared_dependencies (Reverse ForeignKey from DiscoveredDependency) – All declared dependencies of this codebase resource (related name of datafile_resource)
- class Type(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
List of CodebaseResource types.
- DIRECTORY = 'directory'
- FILE = 'file'
- SYMLINK = 'symlink'
- add_package(discovered_package)
Assign the discovered_package to this codebase_resource instance.
- as_spdx()
Return this CodebaseResource as an SPDX Package entry.
- children(codebase=None)
Return a QuerySet of direct children CodebaseResource objects using a database query on the current CodebaseResource path.
Paths are returned in lower-cased sorted path order to reflect the behavior of the commoncode.resource.Resource.children() https://github.com/aboutcode-org/commoncode/blob/main/src/commoncode/resource.py
codebase is not used in this context but required for compatibility with the commoncode.resource.VirtualCodebase class API.
- create_and_add_package(package_data)
Create a DiscoveredPackage instance using the package_data and assigns it to the current CodebaseResource instance.
Errors that may happen during the DiscoveredPackage creation are captured at this level, rather than at the DiscoveredPackage.create_from_data level, so resource data can be injected in the ProjectMessage record.
- classmethod create_from_data(project, resource_data)
Create and return a DiscoveredPackage for a project from the package_data. If one of the values of the required fields is not available, a “ProjectMessage” is created instead of a new DiscoveredPackage instance.
- descendants()
Return a QuerySet of descendant CodebaseResource objects using a database query on the current CodebaseResource path. The current CodebaseResource is not included.
- get_compliance_alert_display(*, field=<django.db.models.CharField: compliance_alert>)
Shows the label of the compliance_alert. See get_FOO_display() for more information.
- get_path_segments_with_subpath()
Return a list of path segment names along with their subpaths for this resource.
Such as:
[ ('root', 'root'), ('subpath', 'root/subpath'), ('file.txt', 'root/subpath/file.txt'), ]
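The shape of this return value can be reproduced with a few lines of plain Python. This is only a sketch of the idea using a hypothetical standalone function, not the method’s actual implementation.

```python
def path_segments_with_subpath(path):
    """Return (segment_name, cumulative_subpath) tuples for a "/"-separated path."""
    parts = [part for part in path.split("/") if part]
    segments = []
    for index, name in enumerate(parts, start=1):
        # The subpath is the path from the first segment up to this one.
        segments.append((name, "/".join(parts[:index])))
    return segments


segments = path_segments_with_subpath("root/subpath/file.txt")
```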
- get_raw_url()
Return the URL to access the RAW content of the resource.
- get_spdx_types()
- get_type_display(*, field=<django.db.models.CharField: type>)
Shows the label of the type. See get_FOO_display() for more information.
- has_parent()
Return True if this CodebaseResource has a parent CodebaseResource or False otherwise.
- parent(codebase=None)
Return the parent CodebaseResource object for this CodebaseResource or None.
codebase is not used in this context but required for compatibility with the commoncode.resource.Codebase class API.
- parent_path()
Return the parent path for this CodebaseResource or None.
- siblings(codebase=None)
Return a sequence of sibling Resource objects for this Resource or an empty sequence.
codebase is not used in this context but required for compatibility with the commoncode.resource.Codebase class API.
- walk(topdown=True)
Return all descendant Resources of the current Resource; does not include self.
Traverses the tree top-down, depth-first if topdown is True; otherwise traverses the tree bottom-up.
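To make the two traversal orders concrete, here is a standalone sketch that walks a nested dict standing in for a resource tree. The walk_tree function and the sample tree are hypothetical; the real method operates on CodebaseResource objects.

```python
def walk_tree(children, topdown=True, prefix=""):
    """Yield descendant paths depth-first; parents first if topdown, else last."""
    for name in sorted(children):
        path = f"{prefix}/{name}" if prefix else name
        sub_children = children[name]
        if topdown:
            yield path
        if sub_children:
            yield from walk_tree(sub_children, topdown=topdown, prefix=path)
        if not topdown:
            yield path


# Children of a "root" directory: one subdirectory and one file.
root_children = {"a": {"x.txt": None}, "b.txt": None}

top_down = list(walk_tree(root_children, topdown=True, prefix="root"))
bottom_up = list(walk_tree(root_children, topdown=False, prefix="root"))
```

In both orders the starting resource ("root" here) is not part of the result, matching the “does not include self” note.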
- compliance_alert
Type: CharField
Compliance alert. Indicates how the license expression complies with provided policies.
Choices:
ok
warning
error
missing
- copyrights
Type: JSONField
Copyrights. List of detected copyright statements (and related detection details).
- declared_dependencies
Type: Reverse ForeignKey from DiscoveredDependency
All declared dependencies of this codebase resource (related name of datafile_resource)
- detected_license_expression
Type: TextField
Detected license expression. The license expression summarizing the license info for this resource, combined from all the license detections
- detected_license_expression_spdx
Type: TextField
Detected license expression spdx. The detected license expression for this file, with SPDX license keys
- discovered_packages
Type: Reverse ManyToManyField from DiscoveredPackage
All discovered packages of this codebase resource (related name of codebase_resources)
- extension
Type: CharField
Extension. File extension for this resource (directories do not have an extension).
- property file_content
Return the content of the current Resource file using TextCode utilities for optimal compatibility.
- property for_packages
Return the list of all discovered packages associated to this resource.
- holders
Type: JSONField
Holders. List of detected copyright holders (and related detection details).
- is_archive
Type: BooleanField
Is archive
- is_binary
Type: BooleanField
Is binary
- property is_dir
Return True, if the resource is a directory.
- property is_file
Return True, if the resource is a file.
- is_key_file
Type: BooleanField
Is key file. True if this file is a top-level file and either a legal, readme or manifest file.
- is_legal
Type: BooleanField
Is legal. True if this file is likely a legal, license-related file such as a COPYING or LICENSE file.
- is_manifest
Type: BooleanField
Is manifest. True if this file is likely a package manifest file such as a Maven pom.xml or an npm package.json
- is_media
Type: BooleanField
Is media
- is_readme
Type: BooleanField
Is readme. True if this file is likely a README file.
- property is_symlink
Return True, if the resource is a symlink.
- is_text
Type: BooleanField
Is text
- is_top_level
Type: BooleanField
Is top level. True if this file is a top-level file located either at the root of a package or in a well-known common location.
- license_clues
Type: JSONField
License clues. List of license matches that are not proper detections and potentially just clues to licenses or likely false positives. Those are not included in computing the detected license expression for the resource.
- license_expression_field = 'detected_license_expression'
- property location
Return the location of the resource as a string.
- property location_path
Return the location of the resource as a Path instance.
- mime_type
Type: CharField
Mime type. MIME type (aka. media type) for this resource. See https://en.wikipedia.org/wiki/Media_type
- property name_without_extension
Return the name of the resource without its extension.
- package_data
Type: JSONField
Package data. List of Package data detected from this CodebaseResource
- path
Type: CharField
Path. The full path value of a resource (file or directory) in the archive it is from.
- percentage_of_license_text
Type: FloatField
Percentage of license text. Percentage of file words detected as license text or notice.
- programming_language
Type: CharField
Programming language. Programming language of this resource if this is a code file.
- project
Type: ForeignKey to Project
Project (related name: codebaseresources)
- related_from
Type: Reverse ForeignKey from CodebaseRelation
All related from of this codebase resource (related name of to_resource)
- related_to
Type: Reverse ForeignKey from CodebaseRelation
All related to of this codebase resource (related name of from_resource)
- rootfs_path
Type: CharField
Rootfs path. Path relative to some root filesystem root directory. Useful when working on disk images, docker images, and VM images. E.g.: “/usr/bin/bash” for a path of “tarball-extract/rootfs/usr/bin/bash”
- size
Type: BigIntegerField
Size. Size in bytes.
- property spdx_id
DiscoveredPackage
- class scanpipe.models.DiscoveredPackage
A project’s Discovered Packages are records of the system and application packages discovered in the code under analysis. Each record is identified by its Package URL. Package URL is a fundamental effort to create informative identifiers for software packages, such as Debian, RPM, npm, Maven, or PyPI packages. See https://github.com/package-url for more details.
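As a rough illustration of how the type, namespace, name, version, qualifiers, and subpath fields relate to a Package URL string, the sketch below assembles one by hand. The simple_purl helper is hypothetical and deliberately simplified: it skips the percent-encoding and normalization rules of the Package URL specification, so real code would normally rely on a dedicated purl library instead.

```python
def simple_purl(package_type, name, namespace="", version="", qualifiers="", subpath=""):
    """Assemble pkg:type/namespace/name@version?qualifiers#subpath (no encoding)."""
    purl = f"pkg:{package_type}/"
    if namespace:
        purl += f"{namespace}/"
    purl += name
    if version:
        purl += f"@{version}"
    if qualifiers:
        purl += f"?{qualifiers}"
    if subpath:
        purl += f"#{subpath}"
    return purl


pypi_purl = simple_purl("pypi", "django", version="4.2")
deb_purl = simple_purl(
    "deb", "curl", namespace="debian", version="7.50.3-1", qualifiers="arch=i386"
)
```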
- Parameters:
id (AutoField) – Primary key: ID
type (CharField) – Type. A short code to identify the type of this package. For example: gem for a Rubygem, docker for a container, pypi for a Python Wheel or Egg, maven for a Maven Jar, deb for a Debian package, etc.
namespace (CharField) – Namespace. Package name prefix, such as Maven groupid, Docker image owner, GitHub user or organization, etc.
name (CharField) – Name. Name of the package.
version (CharField) – Version. Version of the package.
qualifiers (CharField) – Qualifiers. Extra qualifying data for a package such as the name of an OS, architecture, distro, etc.
subpath (CharField) – Subpath. Extra subpath within a package, relative to the package root.
md5 (CharField) – MD5. MD5 checksum hex-encoded, as in md5sum.
sha1 (CharField) – SHA1. SHA1 checksum hex-encoded, as in sha1sum.
sha256 (CharField) – SHA256. SHA256 checksum hex-encoded, as in sha256sum.
sha512 (CharField) – SHA512. SHA512 checksum hex-encoded, as in sha512sum.
extra_data (JSONField) – Extra data. Optional mapping of extra data key/values.
compliance_alert (CharField) – Compliance alert. Indicates how the license expression complies with provided policies.
affected_by_vulnerabilities (JSONField) – Affected by vulnerabilities
filename (CharField) – Filename. File name of a Resource sometimes part of the URI proper and sometimes only available through an HTTP header.
primary_language (CharField) – Primary language. Primary programming language.
description (TextField) – Description. Description for this package. By convention the first line should be a summary when available.
release_date (DateField) – Release date. The date that the package file was created, or when it was posted to its original download source.
homepage_url (CharField) – Homepage URL. URL to the homepage for this package.
download_url (CharField) – Download URL. A direct download URL.
size (BigIntegerField) – Size. Size in bytes.
bug_tracking_url (CharField) – Bug tracking URL. URL to the issue or bug tracker for this package.
code_view_url (CharField) – Code view URL. A URL where the code can be browsed online.
vcs_url (CharField) – VCS URL. A URL to the VCS repository in the SPDX form of: “git”, “svn”, “hg”, “bzr”, “cvs”, https://github.com/nexb/scancode-toolkit.git@405aaa4b3 See SPDX specification “Package Download Location” at https://spdx.org/spdx-specification-21-web-version#h.49x2ik5
repository_homepage_url (CharField) – Repository homepage URL. URL to the page for this package in its package repository. This is typically different from the package homepage URL proper.
repository_download_url (CharField) – Repository download URL. Download URL to download the actual archive of code of this package in its package repository. This may be different from the actual download URL.
api_data_url (CharField) – API data URL. API URL to obtain structured data for this package such as the URL to a JSON or XML API in its package repository.
copyright (TextField) – Copyright. Copyright statements for this package. Typically one per line.
holder (TextField) – Holder. Holders for this package. Typically one per line.
declared_license_expression (TextField) – Declared license expression. The license expression for this package typically derived from its extracted_license_statement or from some other type-specific routine or convention.
declared_license_expression_spdx (TextField) – Declared license expression spdx. The SPDX license expression for this package converted from its declared_license_expression.
license_detections (JSONField) – License detections. A list of LicenseDetection mappings typically derived from its extracted_license_statement or from some other type-specific routine or convention.
other_license_expression (TextField) – Other license expression. The license expression for this package which is different from the declared_license_expression (i.e. not the primary license).
other_license_expression_spdx (TextField) – Other license expression spdx. The other SPDX license expression for this package converted from its other_license_expression.
other_license_detections (JSONField) – Other license detections. A list of LicenseDetection mappings which are different from the declared_license_expression (i.e. not the primary license). These are the detections for the license expressions in other_license_expression.
extracted_license_statement (TextField) – Extracted license statement. The license statement mention, tag or text as found in a package manifest and extracted. This can be a string, a list or dict of strings possibly nested, as found originally in the manifest.
notice_text (TextField) – Notice text. A notice text for this package.
is_private (BooleanField) – Is private. True if this is a private package, either not meant to be published on a repository, and/or a local package without a name and version used primarily to track dependencies and other information.
is_virtual (BooleanField) – Is virtual. True if this package is created only from a manifest or lockfile, and not from its actual packaged code. The files of this package are not present in the codebase.
datasource_ids (JSONField) – Datasource ids. The identifiers for the datafile handlers used to obtain this package.
datafile_paths (JSONField) – Datafile paths. A list of Resource paths for package datafiles which were used to assemble this package.
file_references (JSONField) – File references. List of file paths and details for files referenced in a package manifest. These may not actually exist on the filesystem. The exact semantics and base of these paths is specific to a package type or datafile format.
parties (JSONField) – Parties. A list of parties such as a person, project or organization.
uuid (UUIDField) – UUID
missing_resources (JSONField) – Missing resources
modified_resources (JSONField) – Modified resources
package_uid (CharField) – Package uid. Unique identifier for this package.
keywords (JSONField) – Keywords
notes (TextField) – Notes
source_packages (JSONField) – Source packages
tag (CharField) – Tag
Relationship fields:
- Parameters:
project (ForeignKey to Project) – Project (related name: discoveredpackages)
codebase_resources (ManyToManyField to CodebaseResource) – Codebase resources (related name: discovered_packages)
children_packages (ManyToManyField to DiscoveredPackage) – Children packages (related name: parent_packages)
Reverse relationships:
- Parameters:
parent_packages (Reverse ManyToManyField from DiscoveredPackage) – All parent packages of this discovered package (related name of children_packages)
declared_dependencies (Reverse ForeignKey from DiscoveredDependency) – All declared dependencies of this discovered package (related name of for_package)
resolved_from_dependencies (Reverse ForeignKey from DiscoveredDependency) – All resolved from dependencies of this discovered package (related name of resolved_to_package)
- add_resources(codebase_resources)
Assign the codebase_resources to this discovered_package instance.
- as_cyclonedx()
Return this DiscoveredPackage as a CycloneDX Component entry.
- as_spdx()
Return this DiscoveredPackage as an SPDX Package entry.
- classmethod clean_data(data)
Return the data dict keeping only entries for fields available in the model.
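The filtering idea can be sketched as follows. This is a minimal illustration, not the scanpipe implementation: MODEL_FIELDS is a hypothetical stand-in for the field names defined on the model, and the function works on plain dicts.

```python
# Hypothetical stand-in for the field names available on the model.
MODEL_FIELDS = {"type", "namespace", "name", "version", "description"}


def clean_data(data, model_fields=MODEL_FIELDS):
    """Return a new dict keeping only the entries whose key is a known field."""
    return {key: value for key, value in data.items() if key in model_fields}


raw = {"name": "requests", "version": "2.31.0", "not_a_field": 1}
print(clean_data(raw))
```

Entries with unknown keys are dropped silently in this sketch, so the input can be passed on without raising on unexpected keys.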
- classmethod create_from_data(project, package_data)
Create and return a DiscoveredPackage for a given project based on package_data.
If the required name field is missing in package_data, a ProjectMessage is created instead of a DiscoveredPackage instance.
If the type field is missing in package_data, it defaults to “unknown” before creating the DiscoveredPackage.
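The two rules above can be sketched with plain dicts. This is an illustration under stated assumptions: the messages list is a hypothetical stand-in for ProjectMessage records, the message wording is invented, and no database is involved.

```python
def create_from_data(package_data, messages):
    """
    Return a package dict, or None when the required name is missing.

    `messages` is a hypothetical list standing in for ProjectMessage records.
    """
    data = dict(package_data)

    # Rule 1: no name, so record a message instead of creating a package.
    if not data.get("name"):
        messages.append({"description": "Missing name", "details": data})
        return None

    # Rule 2: a missing type defaults to "unknown".
    if not data.get("type"):
        data["type"] = "unknown"

    return data


messages = []
print(create_from_data({"name": "zlib"}, messages))
print(create_from_data({"type": "deb"}, messages), len(messages))
```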
- classmethod extract_purl_data(package_data)
- get_compliance_alert_display(*, field=<django.db.models.CharField: compliance_alert>)
Shows the label of the compliance_alert. See get_FOO_display() for more information.
- get_declared_license_expression()
Return this package license expression.
Use declared_license_expression when available or compute the expression from declared_license_expression_spdx.
- get_declared_license_expression_spdx()
Return this package license expression using SPDX keys.
Use declared_license_expression_spdx when available or compute the expression from declared_license_expression.
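The two getters mirror each other: each returns its own field when set and otherwise converts from the other one. The sketch below illustrates that fallback only. The toy mapping, the whitespace-based convert() helper, and the empty-string result when both fields are empty are assumptions made for the example; the real conversion is done by the licensing machinery and handles full expression syntax.

```python
# Toy mapping between ScanCode license keys and SPDX identifiers (illustration only).
SCANCODE_TO_SPDX = {"mit": "MIT", "apache-2.0": "Apache-2.0"}
SPDX_TO_SCANCODE = {spdx: key for key, spdx in SCANCODE_TO_SPDX.items()}


def convert(expression, mapping):
    """Translate each license symbol; operators such as AND/OR pass through."""
    return " ".join(mapping.get(token, token) for token in expression.split())


def get_declared_license_expression(package):
    if package.get("declared_license_expression"):
        return package["declared_license_expression"]
    spdx = package.get("declared_license_expression_spdx")
    return convert(spdx, SPDX_TO_SCANCODE) if spdx else ""


def get_declared_license_expression_spdx(package):
    if package.get("declared_license_expression_spdx"):
        return package["declared_license_expression_spdx"]
    expression = package.get("declared_license_expression")
    return convert(expression, SCANCODE_TO_SPDX) if expression else ""


print(get_declared_license_expression_spdx({"declared_license_expression": "mit AND apache-2.0"}))
```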
- api_data_url
Type:
CharField
API data URL. API URL to obtain structured data for this package, such as the URL to a JSON or XML API in its package repository.
- bug_tracking_url
Type:
CharField
Bug tracking URL. URL to the issue or bug tracker for this package.
- children_packages
Type:
ManyToManyField to DiscoveredPackage
Children packages (related name: parent_packages)
- codebase_resources
Type:
ManyToManyField to CodebaseResource
Codebase resources (related name: discovered_packages)
- compliance_alert
Type:
CharField
Compliance alert. Indicates how the license expression complies with provided policies.
Choices:
ok
warning
error
missing
- copyright
Type:
TextField
Copyright. Copyright statements for this package. Typically one per line.
- property cyclonedx_bom_ref
Use the package_uid when available to ensure having unique bom_ref in the SBOM when several instances of the same DiscoveredPackage (i.e. same purl) are present in the project.
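A minimal sketch of why this matters: two instances sharing the same purl would collide if the purl alone were used as bom_ref. Falling back to the purl when no package_uid is set is an assumption of this sketch, and the uid values are made up for illustration.

```python
def cyclonedx_bom_ref(package_uid, purl):
    """Prefer the package_uid; fall back to the purl (assumed fallback)."""
    return package_uid or purl


purl = "pkg:pypi/requests@2.31.0"
# Two instances of the same package, told apart by hypothetical uid values.
refs = {
    cyclonedx_bom_ref(f"{purl}?uuid=aaaa", purl),
    cyclonedx_bom_ref(f"{purl}?uuid=bbbb", purl),
}
print(len(refs))
```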
- datafile_paths
Type:
JSONField
Datafile paths. A list of Resource paths for package datafiles which were used to assemble this package.
- datasource_ids
Type:
JSONField
Datasource ids. The identifiers for the datafile handlers used to obtain this package.
- declared_dependencies
Type:
Reverse ForeignKey from DiscoveredDependency
All declared dependencies of this discovered package (related name of for_package)
- declared_license_expression
Type:
TextField
Declared license expression. The license expression for this package typically derived from its extracted_license_statement or from some other type-specific routine or convention.
- declared_license_expression_spdx
Type:
TextField
Declared license expression spdx. The SPDX license expression for this package converted from its declared_license_expression.
- description
Type:
TextField
Description. Description for this package. By convention the first line should be a summary when available.
- extracted_license_statement
Type:
TextField
Extracted license statement. The license statement mention, tag or text as found in a package manifest and extracted. This can be a string, a list or dict of strings possibly nested, as found originally in the manifest.
- file_references
Type:
JSONField
File references. List of file paths and details for files referenced in a package manifest. These may not actually exist on the filesystem. The exact semantics and base of these paths is specific to a package type or datafile format.
- filename
Type:
CharField
Filename. File name of a Resource, sometimes part of the URI proper and sometimes only available through an HTTP header.
- is_private
Type:
BooleanField
Is private. True if this is a private package, either not meant to be published on a repository, and/or a local package without a name and version used primarily to track dependencies and other information.
- is_virtual
Type:
BooleanField
Is virtual. True if this package is created only from a manifest or lockfile, and not from its actual packaged code. The files of this package are not present in the codebase.
- license_detections
Type:
JSONField
License detections. A list of LicenseDetection mappings typically derived from its extracted_license_statement or from some other type-specific routine or convention.
- license_expression_field = 'declared_license_expression'
- namespace
Type:
CharField
Namespace. Package name prefix, such as Maven groupid, Docker image owner, GitHub user or organization, etc.
- other_license_detections
Type:
JSONField
Other license detections. A list of LicenseDetection mappings for licenses other than the declared_license_expression (i.e. not the primary license). These are the detections for the license expressions in other_license_expression.
- other_license_expression
Type:
TextField
Other license expression. The license expression for this package which is different from the declared_license_expression (i.e. not the primary license).
- other_license_expression_spdx
Type:
TextField
Other license expression spdx. The other SPDX license expression for this package converted from its other_license_expression.
- parent_packages
Type:
Reverse ManyToManyField from DiscoveredPackage
All parent packages of this discovered package (related name of children_packages)
- project
Type:
ForeignKey to Project
Project (related name: discoveredpackages)
- property purl
Return the Package URL.
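A Package URL combines the type, namespace, name, version, qualifiers and subpath fields into a single string of the form pkg:type/namespace/name@version?qualifiers#subpath. The sketch below assembles one by hand; build_purl is a hypothetical helper written for this example, and it skips the percent-encoding and normalization rules of the Package URL specification.

```python
def build_purl(package_type, name, namespace="", version="", qualifiers="", subpath=""):
    """Assemble a simplified purl string; no encoding or normalization is applied."""
    purl = f"pkg:{package_type}/"
    if namespace:
        purl += f"{namespace}/"
    purl += name
    if version:
        purl += f"@{version}"
    if qualifiers:
        purl += f"?{qualifiers}"
    if subpath:
        purl += f"#{subpath}"
    return purl


print(build_purl("maven", "batik-anim", namespace="org.apache.xmlgraphics", version="1.9.1"))
```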
- qualifiers
Type:
CharField
Qualifiers. Extra qualifying data for a package such as the name of an OS, architecture, distro, etc.
- release_date
Type:
DateField
Release date. The date that the package file was created, or when it was posted to its original download source.
- repository_download_url
Type:
CharField
Repository download URL. Download URL to download the actual archive of code of this package in its package repository. This may be different from the actual download URL.
- repository_homepage_url
Type:
CharField
Repository homepage URL. URL to the page for this package in its package repository. This is typically different from the package homepage URL proper.
- resolved_from_dependencies
Type:
Reverse ForeignKey from DiscoveredDependency
All resolved from dependencies of this discovered package (related name of resolved_to_package)
- resources
Return the assigned codebase_resources QuerySet as a list.
- size
Type:
BigIntegerField
Size. Size in bytes.
- property spdx_id
- type
Type:
CharField
Type. A short code to identify the type of this package. For example: gem for a Rubygem, docker for a container, pypi for a Python Wheel or Egg, maven for a Maven Jar, deb for a Debian package, etc.
- vcs_url
Type:
CharField
VCS URL. A URL to the VCS repository in the SPDX form of: “git”, “svn”, “hg”, “bzr”, “cvs”, for example https://github.com/nexb/scancode-toolkit.git@405aaa4b3. See the SPDX specification “Package Download Location” at https://spdx.org/spdx-specification-21-web-version#h.49x2ik5
DiscoveredDependency
- class scanpipe.models.DiscoveredDependency
A project’s Discovered Dependencies are records of the dependencies used by system and application packages discovered in the code under analysis. Dependencies are usually collected from parsed package data such as a package manifest or lockfile.
- Parameters:
id (AutoField) – Primary key: ID
type (CharField) – Type. A short code to identify the type of this package. For example: gem for a Rubygem, docker for a container, pypi for a Python Wheel or Egg, maven for a Maven Jar, deb for a Debian package, etc.
namespace (CharField) – Namespace. Package name prefix, such as Maven groupid, Docker image owner, GitHub user or organization, etc.
name (CharField) – Name. Name of the package.
version (CharField) – Version. Version of the package.
qualifiers (CharField) – Qualifiers. Extra qualifying data for a package such as the name of an OS, architecture, distro, etc.
subpath (CharField) – Subpath. Extra subpath within a package, relative to the package root.
affected_by_vulnerabilities (JSONField) – Affected by vulnerabilities
dependency_uid (CharField) – Dependency uid. The unique identifier of this dependency.
extracted_requirement (CharField) – Extracted requirement. The version requirements of this dependency.
scope (CharField) – Scope. The scope of this dependency, how it is used in a project.
datasource_id (CharField) – Datasource id. The identifier for the datafile handler used to obtain this dependency.
is_runtime (BooleanField) – Is runtime. True if this dependency is a runtime dependency.
is_optional (BooleanField) – Is optional. True if this dependency is an optional dependency.
is_resolved (BooleanField) – Is resolved. True if this dependency version requirement has been pinned and this dependency points to an exact version.
is_direct (BooleanField) – Is direct. True if this is a direct, first-level dependency relationship for a package.
Relationship fields:
- Parameters:
project (ForeignKey to Project) – Project (related name: discovereddependencies)
for_package (ForeignKey to DiscoveredPackage) – For package. The package that declares this dependency. (related name: declared_dependencies)
resolved_to_package (ForeignKey to DiscoveredPackage) – Resolved to package. The resolved package for this dependency. If empty, it indicates the dependency is unresolved. (related name: resolved_from_dependencies)
datafile_resource (ForeignKey to CodebaseResource) – Datafile resource. The codebase resource (e.g., manifest or lockfile) that declares this dependency. (related name: declared_dependencies)
- as_spdx()
Return this Dependency as an SPDX Package entry.
- classmethod create_from_data(project, dependency_data, for_package=None, resolved_to_package=None, datafile_resource=None, datasource_id=None, strip_datafile_path_root=False)
Create and return a DiscoveredDependency for a project from the dependency_data.
If strip_datafile_path_root is True, then create_from_data() will strip the root path segment from the datafile_path of dependency_data before looking up the corresponding CodebaseResource for datafile_path. This is used in the case where Dependency data is imported from a scancode-toolkit scan, where the root path segments are not stripped for datafile_path.
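Stripping the root path segment can be sketched as below. strip_root is a hypothetical helper written for this example, not the function used by scanpipe, and the sample path is made up.

```python
def strip_root(path):
    """Remove the first path segment: 'root/dir/file' becomes 'dir/file'."""
    segments = path.strip("/").split("/")
    return "/".join(segments[1:])


# A scancode-toolkit style path that still carries its root segment.
print(strip_root("codebase/package/setup.py"))
```

The stripped path is then the value used to look up the matching CodebaseResource.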
- classmethod extract_purl_data(dependency_data, ignore_nulls=False)
- classmethod populate_dependency_uuid(dependency_data)
- property base_purl
- datafile_path
- datafile_resource
Type:
ForeignKey to CodebaseResource
Datafile resource. The codebase resource (e.g., manifest or lockfile) that declares this dependency. (related name: declared_dependencies)
- datafile_resource_id
Internal field, use datafile_resource instead.
- datasource_id
Type:
CharField
Datasource id. The identifier for the datafile handler used to obtain this dependency.
- extracted_requirement
Type:
CharField
Extracted requirement. The version requirements of this dependency.
- for_package
Type:
ForeignKey to DiscoveredPackage
For package. The package that declares this dependency. (related name: declared_dependencies)
- for_package_id
Internal field, use for_package instead.
- for_package_uid
- is_direct
Type:
BooleanField
Is direct. True if this is a direct, first-level dependency relationship for a package.
- is_optional
Type:
BooleanField
Is optional. True if this dependency is an optional dependency.
- is_resolved
Type:
BooleanField
Is resolved. True if this dependency version requirement has been pinned and this dependency points to an exact version.
- is_runtime
Type:
BooleanField
Is runtime. True if this dependency is a runtime dependency.
- namespace
Type:
CharField
Namespace. Package name prefix, such as Maven groupid, Docker image owner, GitHub user or organization, etc.
- property package_type
- project
Type:
ForeignKey to Project
Project (related name: discovereddependencies)
- property purl
- qualifiers
Type:
CharField
Qualifiers. Extra qualifying data for a package such as the name of an OS, architecture, distro, etc.
- resolved_to_package
Type:
ForeignKey to DiscoveredPackage
Resolved to package. The resolved package for this dependency. If empty, it indicates the dependency is unresolved. (related name: resolved_from_dependencies)
- resolved_to_package_id
Internal field, use resolved_to_package instead.
- resolved_to_package_uid
- property spdx_id
CodebaseRelation
- class scanpipe.models.CodebaseRelation
Relation between two CodebaseResource instances.
- Parameters:
Relationship fields:
- Parameters:
project (ForeignKey to Project) – Project (related name: codebaserelations)
from_resource (ForeignKey to CodebaseResource) – From resource (related name: related_to)
to_resource (ForeignKey to CodebaseResource) – To resource (related name: related_from)
- from_resource
Type:
ForeignKey to CodebaseResource
From resource (related name: related_to)
- from_resource_id
Internal field, use from_resource instead.
- project
Type:
ForeignKey to Project
Project (related name: codebaserelations)
- property score
- property status
- to_resource
Type:
ForeignKey to CodebaseResource
To resource (related name: related_from)
- to_resource_id
Internal field, use to_resource instead.
ProjectMessage
- class scanpipe.models.ProjectMessage
Stores messages such as errors and exceptions raised during a pipeline run.
- Parameters:
uuid (UUIDField) – Primary key: UUID
severity (CharField) – Severity. Severity level of the message.
description (TextField) – Description. Description.
model (CharField) – Model. Name of the model class.
details (JSONField) – Details. Data that caused the error.
traceback (TextField) – Traceback. Exception traceback.
created_date (DateTimeField) – Created date
Relationship fields:
- Parameters:
project (ForeignKey to Project) – Project (related name: projectmessages)
- class Severity(value, names=None, *, module=None, qualname=None, type=None, start=1, boundary=None)
- ERROR = 'error'
- INFO = 'info'
- WARNING = 'warning'
- get_severity_display(*, field=<django.db.models.CharField: severity>)
Shows the label of the severity. See get_FOO_display() for more information.
- created_date
Type:
DateTimeField
Created date
- project
Type:
ForeignKey to Project
Project (related name: projectmessages)
Run
- class scanpipe.models.Run
The Database representation of a pipeline execution.
- Parameters:
uuid (UUIDField) – Primary key: UUID
task_id (UUIDField) – Task id
task_start_date (DateTimeField) – Task start date
task_end_date (DateTimeField) – Task end date
task_exitcode (IntegerField) – Task exitcode
task_output (TextField) – Task output
log (TextField) – Log
pipeline_name (CharField) – Pipeline name. Identify a registered Pipeline class.
created_date (DateTimeField) – Created date
scancodeio_version (CharField) – Scancodeio version
description (TextField) – Description
current_step (CharField) – Current step
selected_groups (JSONField) – Selected groups
selected_steps (JSONField) – Selected steps
Relationship fields:
- Parameters:
project (ForeignKey to Project) – Project (related name: runs)
Reverse relationships:
- Parameters:
webhook_deliveries (Reverse ForeignKey from WebhookDelivery) – All webhook deliveries of this run (related name of run)
- deliver_project_subscriptions(has_next_run=False)
Triggers related project Webhook subscriptions.
- execute_task_async()
Enqueues the pipeline execution task for an asynchronous execution.
- get_diff_url()
Return a GitHub diff URL between this Run commit at the time of execution and the current commit of the ScanCode.io app instance. The URL is only returned if both commits are available and if they differ.
- get_previous_runs()
Return all the previous Run instances regardless of their status.
- make_pipeline_instance()
Return a pipeline instance using this Run pipeline_class.
- profile(print_results=False)
Return computed execution times for each step in the current Run.
If print_results is provided, the results are printed to stdout.
- set_current_step(message)
Set the message value on the current_step field. Truncate the value at 256 characters.
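The truncation amounts to a slice at the field's maximum length. A minimal sketch, where truncate_step_message is a hypothetical helper and only the 256-character limit comes from the description above:

```python
MAX_LENGTH = 256  # Limit stated in the description of set_current_step().


def truncate_step_message(message, max_length=MAX_LENGTH):
    """Return the message cut to at most `max_length` characters."""
    return message[:max_length]


long_message = "Step [scan_for_files] " + "x" * 500
print(len(truncate_step_message(long_message)))
```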
- set_scancodeio_version()
Set the current ScanCode.io version on the scancodeio_version field.
- start()
Start the pipeline execution when allowed, or raise an exception.
- sync_with_job()
Synchronise this Run instance with its related RQ Job. This is required when a Run gets out of sync with its Job. This can happen when the worker or one of its processes is killed: the Run status is not properly updated and may stay in a Queued or Running state forever. If the Run is out of sync with its related Job, the Run status is updated accordingly. If the Run was in the queue, it is enqueued again.
- property can_start
Return True if this Run is allowed to start its execution.
A Run is not allowed to start when any of the previous Run instances within the pipeline has not completed (not started, queued, or running). This is enforced to ensure the pipelines are run in sequential order.
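The sequential rule can be sketched with an ordered list of runs. This is an illustration only: RunSketch, the status strings, and the index-based lookup are assumptions of the example, not the Run model API.

```python
from dataclasses import dataclass

# Hypothetical status labels for the "not completed" states named above.
INCOMPLETE = {"not_started", "queued", "running"}


@dataclass
class RunSketch:
    pipeline_name: str
    status: str


def can_start(runs, index):
    """Return True if every run created before runs[index] has completed."""
    return all(run.status not in INCOMPLETE for run in runs[:index])


runs = [
    RunSketch("first_pipeline", "success"),
    RunSketch("second_pipeline", "not_started"),
    RunSketch("third_pipeline", "not_started"),
]
print([can_start(runs, i) for i in range(len(runs))])
```

In this sketch the third run has to wait because the second one has not started yet, which is the ordering guarantee described above.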
- created_date
Type:
DateTimeField
Created date
- property pipeline_class
Return this Run pipeline_class.
- project
Type:
ForeignKey to Project
Project (related name: runs)
- property results_url
Return the rendered results_url if defined on the Pipeline class.
- task_end_date
Type:
DateTimeField
Task end date
- task_exitcode
Type:
IntegerField
Task exitcode
- task_start_date
Type:
DateTimeField
Task start date
- webhook_deliveries
Type:
Reverse ForeignKey from WebhookDelivery
All webhook deliveries of this run (related name of run)