binaryai.client

Module Contents

Classes

BinaryAI

BinaryAI client used to interact with servers.

Attributes

SDK_VERSION

DEFAULT_SDK_NAME

DEFAULT_POLL_INTERVAL

DEFAULT_POLL_TIMEOUT

HEADER_REQUEST_SOURCE

DEFAULT_ENDPOINT

DEFAULT_LICENSE_SEPARATOR

binaryai.client.SDK_VERSION
binaryai.client.DEFAULT_SDK_NAME = 'PythonSDK'
binaryai.client.DEFAULT_POLL_INTERVAL = 2
binaryai.client.DEFAULT_POLL_TIMEOUT = 60
binaryai.client.HEADER_REQUEST_SOURCE
binaryai.client.DEFAULT_ENDPOINT = 'https://api.binaryai.cn/v1/endpoint'
binaryai.client.DEFAULT_LICENSE_SEPARATOR = ','
class binaryai.client.BinaryAI(*, secret_id: str | None = os.environ.get('BINARYAI_SECRET_ID'), secret_key: str | None = os.environ.get('BINARYAI_SECRET_KEY'), endpoint: str = os.environ.get('BINARYAI_ENDPOINT') or DEFAULT_ENDPOINT)[source]

Bases: object

BinaryAI client used to interact with servers. Users can upload files, run analysis, and retrieve the detailed results by using this client.

Note: The underlying transport session/connection can only be used by one caller at a time, so this class is NOT THREAD SAFE.
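Because the class is not thread safe, one common way to use it from several threads is to give each thread its own client. The following is a minimal sketch of that pattern using only the standard library. The helper name `client_for_this_thread` is hypothetical and not part of the SDK; with the real SDK the `factory` argument would typically be the `BinaryAI` class itself.

```python
import threading

# Per-thread storage: each thread sees its own independent "client" attribute.
_local = threading.local()


def client_for_this_thread(factory):
    """Return this thread's client, creating it with factory() on first use.

    Hypothetical helper, not part of the SDK. `factory` is any zero-argument
    callable that builds a client, for example the BinaryAI class.
    """
    client = getattr(_local, "client", None)
    if client is None:
        client = factory()
        _local.client = client
    return client
```

Each worker thread would call `client_for_this_thread(...)` instead of sharing one global instance, so no two threads ever touch the same session.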
upload(filepath: str | None = None, *, mem: bytes | None = None, hooks: Dict | None = None, sha256: str | None = None, md5: str | None = None) str[source]

Uploads a file to server.

At least one of the following inputs must not be None:
  • File upload: set filepath to the path of the file on disk.

  • Memory upload: set mem to the bytes of the file in memory.

If you only have the hash, you can try setting sha256 or md5 instead, but a FileRequiredError might be raised. Hashes are ignored if the file is already provided through filepath or mem. When multiple hashes are provided, only sha256 is used.

Memory upload, hash upload, and hooks are experimental features. They might change without notice.

Parameters:
  • filepath (Optional) – A pathname to a given file for file upload.

  • mem (Optional) – A byte buffer for a file in memory to be uploaded.

  • hooks (Optional) – A dict to modify arguments before certain operations.

  • sha256 (Optional) – A string for hash upload.

  • md5 (Optional) – A string for hash upload.

Returns:

The actual sha256 that the server calculates and returns.
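The input rules for upload can be summarized as a small decision function. This is only a sketch of the documented precedence, not the SDK's implementation. The name `pick_upload_source`, the choice of filepath over mem when both are given, and the use of ValueError when nothing is provided are all assumptions made for illustration.

```python
from typing import Optional, Tuple


def pick_upload_source(
    filepath: Optional[str] = None,
    *,
    mem: Optional[bytes] = None,
    sha256: Optional[str] = None,
    md5: Optional[str] = None,
) -> Tuple[str, object]:
    """Return (kind, value) for the input the documented rules would use."""
    # A provided file always wins; hashes are ignored in that case.
    # Assumption: filepath is checked before mem (the docs do not say).
    if filepath is not None:
        return ("filepath", filepath)
    if mem is not None:
        return ("mem", mem)
    # Hash-only upload: when both hashes are given, only sha256 is used.
    if sha256 is not None:
        return ("sha256", sha256)
    if md5 is not None:
        return ("md5", md5)
    # Assumption: the SDK's real error for "no input at all" may differ.
    raise ValueError("provide filepath, mem, sha256, or md5")
```

A hash-only call can still fail on the server side with FileRequiredError, so callers that hold the file should prefer filepath or mem.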

_reanalyze(sha256: str)[source]

Reanalyze target file.

Parameters:

sha256 – File sha256sum.

wait_until_analysis_done(sha256: str, timeout: int = DEFAULT_POLL_TIMEOUT, interval: int = DEFAULT_POLL_INTERVAL)[source]

Wait until a latest stable result is available, that is, until all analyses on this file are done. You can set the wait timeout in seconds. If no stable result is available after the timeout, a TimeoutError is raised.

If the parts being waited on fail instead of succeeding, this function does not raise any exception. To get detailed information about the status, call get_analyze_status.

To analyze files in parallel, consider calling this function in a separate thread, since it waits by calling threading.Event. This function’s implementation is a good reference for judging whether a file has finished analyzing.

Parameters:
  • sha256 – File sha256 sum.

  • timeout (int) – maximum wait time in seconds. If negative, wait forever.

  • interval (int) – poll interval in seconds. Raises an error if not positive.
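The timeout and interval semantics described here can be illustrated with a generic polling loop. This is a sketch under stated assumptions, not the SDK's code: `wait_until` and its `is_done` callback are hypothetical, and ValueError is an assumed error type for a non-positive interval. It waits on a `threading.Event`, as the description mentions, which blocks the calling thread between polls.

```python
import threading
import time


def wait_until(is_done, timeout=60, interval=2):
    """Poll is_done() until it returns True.

    Hypothetical helper mirroring the documented semantics:
    a negative timeout waits forever, a non-positive interval is an error.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    deadline = None if timeout < 0 else time.monotonic() + timeout
    pause = threading.Event()  # never set; used only as a blocking sleep
    while not is_done():
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("no stable result before the timeout")
        pause.wait(interval)
```

Since the loop blocks its caller, running one such wait per file in separate threads (each with its own client) is what allows several files to be awaited at once.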

get_analyze_status(sha256: str) Dict[source]

Return the current state of each analyzer. Read the API documentation for the relationship between analyzers and results.

Parameters:

sha256 – File sha256sum.

get_sha256(md5: str) str[source]

Get file sha256 by its md5.

Parameters:

md5 – File md5 hash.

Returns:

File sha256sum.

Return type:

str
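Since most other methods take a sha256, a caller that may hold either kind of hash can normalize it first. The sketch below assumes a hypothetical helper `to_sha256` and decides by hex length (32 characters for md5, 64 for sha256). Lowercasing the input is an assumption about what the server accepts. The only SDK call used is `get_sha256(md5)` as documented above.

```python
import string


def to_sha256(client, digest: str) -> str:
    """Return a sha256 for digest, looking it up when digest is an md5.

    Hypothetical helper; `client` is any object exposing get_sha256(md5).
    """
    digest = digest.strip().lower()  # assumption: lowercase hex is accepted
    if not digest or any(c not in string.hexdigits for c in digest):
        raise ValueError("digest must be a hex string")
    if len(digest) == 64:
        return digest  # already a sha256
    if len(digest) == 32:
        return client.get_sha256(digest)  # md5 -> sha256 lookup on the server
    raise ValueError("digest is neither an md5 nor a sha256")
```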

get_filenames(sha256: str) List[str][source]

Get all uploaded filenames for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of filenames.

Return type:

List[str]

get_mime_type(sha256: str) str[source]

Get MIME type for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

MIME type string.

Return type:

str

get_size(sha256: str) int[source]

Get size in bytes of a given file.

Parameters:

sha256 – File sha256sum.

Returns:

File size in bytes.

Return type:

int

get_compressed_files(sha256: str) List[binaryai.compressed_file.CompressedFile][source]

Get a list of files inside a compressed file identified by a sha256.

Parameters:

sha256 – File sha256sum.

Returns:

A list of files inside the compressed file.

get_all_cve_names(sha256: str) List[str][source]

Get all CVE names for a given file.

Parameters:

sha256 – File sha256sum.

get_all_licenses(sha256: str) List[binaryai.license.License][source]

Get all licenses for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of licenses.

Return type:

List[License]

get_all_license_short_names(sha256: str) List[str][source]

Get all license short names for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of license short names.

Return type:

List[str]

get_all_ascii_strings(sha256: str) List[str][source]

Get all ASCII strings for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of ASCII strings.

Return type:

List[str]

get_sca_result(sha256: str) List[binaryai.component.Component][source]

Get SCA result for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of software components.

Return type:

List[Component]

get_overview(sha256: str) Dict[str, str | int][source]

Fetch analysis overview from BinaryAI Beat server by file’s sha256.

Returns:

Key-value pairs containing an overview of the binary file.

Fetch file download link by file’s sha256.

Returns:

A link that can be used to download the file later. The link might expire.

list_func_offset(sha256: str) List[int][source]

Fetch offsets of functions from analysis.

Returns:

A list of function offsets.

list_funcs(sha256: str, batch_size: int = 32) Iterator[binaryai.function.Function][source]

Parse the list of functions and return a generator that yields Function instances, each containing the function’s name, fileoffset, bytes, and pseudocode.

Parameters:
  • sha256 – File sha256sum.

  • batch_size – Batch size to get functions’ info

Returns:

Function Iterator
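Because list_funcs returns an iterator rather than a full list, a caller can stop early without consuming every function. A minimal sketch, assuming a hypothetical helper `first_functions` and a `client` object exposing `list_funcs(sha256, batch_size=...)` as documented above:

```python
from itertools import islice


def first_functions(client, sha256, limit, batch_size=32):
    """Return at most `limit` functions from the lazy iterator.

    Hypothetical helper; islice stops pulling items once `limit` is reached.
    """
    return list(islice(client.list_funcs(sha256, batch_size=batch_size), limit))
```

How many server requests this saves depends on how the iterator fetches its batches internally, so a smaller batch_size may be worth trying when only a few functions are needed.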

get_func_info(sha256: str, offset: int, with_embedding: bool = False) binaryai.function.Function[source]

Fetch detailed information about the given function identified by its offset address.

Parameters:

offset – Offset address of the desired function.

Returns:

A Function instance containing the given function’s name, fileoffset, bytes, and pseudocode.

get_funcs_info(sha256: str, offsets: List[int], batch_size: int = 32, with_embedding: bool = False) Iterator[binaryai.function.Function][source]

Fetch detailed information about the given functions identified by their offset addresses.

Parameters:
  • offsets – A list of offset addresses of the desired functions.

  • batch_size – Batch size to get functions’ info.

Returns:

Function iterator

Raises:

ValueError – invalid batch size
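The batching behavior and the ValueError for an invalid batch size can be pictured with a small chunking helper. This is an illustrative sketch only: `batched_offsets` is a hypothetical name, and treating any non-positive batch size as invalid is an assumption about what the SDK checks.

```python
from typing import Iterator, List


def batched_offsets(offsets: List[int], batch_size: int = 32) -> Iterator[List[int]]:
    """Split offsets into consecutive chunks of at most batch_size items."""
    # Validate eagerly so the error surfaces at call time, not on first iteration.
    if batch_size <= 0:
        raise ValueError("invalid batch size")
    return (
        offsets[start:start + batch_size]
        for start in range(0, len(offsets), batch_size)
    )
```

Each chunk would correspond to one request for function info, with the results yielded one Function at a time.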

_get_funcs_info(sha256: str, offsets: List[int], step: int = 32, with_embedding: bool = False) Iterator[binaryai.function.Function][source]

Get functions’ info in batches

get_func_match(sha256: str, offset: int) List[binaryai.function.MatchedFunction][source]

Find functions that match the given function, identified by its offset address.

Parameters:

offset – Offset address of the desired function.

Returns:

A list containing 10 match results. Each result contains a score and pseudocode. The list is sorted by score from high to low.

get_khash_info(sha256: str) tuple[bytes, str] | None[source]

Return the KHash of this file. See website for detailed introduction on KHash.

Returns:

KHash’s value and version. Only compare two KHash values if their versions are the same.

You are not expected to parse the version.

Return type:

Optional[Tuple[bytes, str]]
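The rule that KHash values are only comparable when their versions match can be enforced with a small guard. In the sketch below, `khash_bit_difference` is a hypothetical helper, and the metric it uses (the number of differing bits) is purely an illustrative assumption; see the BinaryAI website for how KHash values are meant to be compared. The version string is compared for equality only and is never parsed.

```python
from typing import Optional, Tuple

KHashInfo = Optional[Tuple[bytes, str]]  # same shape as get_khash_info's result


def khash_bit_difference(a: KHashInfo, b: KHashInfo) -> Optional[int]:
    """Return the number of differing bits, or None if not comparable."""
    if a is None or b is None:
        return None  # at least one file has no KHash
    (value_a, version_a), (value_b, version_b) = a, b
    if version_a != version_b:
        return None  # different versions must not be compared
    if len(value_a) != len(value_b):
        raise ValueError("KHash values of the same version differ in length")
    # Assumed metric: count the bits that differ, byte by byte.
    return sum(bin(x ^ y).count("1") for x, y in zip(value_a, value_b))
```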