binaryai.client

Module Contents

Classes

BinaryAI

BinaryAI client used to interact with servers.

Attributes

SDK_VERSION

DEFAULT_SDK_NAME

DEFAULT_POLL_INTERVAL

DEFAULT_POLL_TIMEOUT

HEADER_REQUEST_SOURCE

DEFAULT_ENDPOINT

DEFAULT_LICENSE_SEPARATOR

binaryai.client.SDK_VERSION
binaryai.client.DEFAULT_SDK_NAME = 'PythonSDK'
binaryai.client.DEFAULT_POLL_INTERVAL = 2
binaryai.client.DEFAULT_POLL_TIMEOUT = 60
binaryai.client.HEADER_REQUEST_SOURCE
binaryai.client.DEFAULT_ENDPOINT = 'https://api.binaryai.cn/v1/endpoint'
binaryai.client.DEFAULT_LICENSE_SEPARATOR = ','
class binaryai.client.BinaryAI(*, secret_id: str | None = os.environ.get('BINARYAI_SECRET_ID'), secret_key: str | None = os.environ.get('BINARYAI_SECRET_KEY'), endpoint: str = os.environ.get('BINARYAI_ENDPOINT') or DEFAULT_ENDPOINT)

Bases: object

BinaryAI client used to interact with servers. Users can upload files, run analysis, and retrieve detailed results by using this client.

Note: Because the underlying transport session/connection can only be used by one caller at a time, this class is NOT THREAD SAFE.
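
The constructor signature above takes its defaults from the environment variables BINARYAI_SECRET_ID, BINARYAI_SECRET_KEY, and BINARYAI_ENDPOINT. A minimal sketch of that resolution logic, assuming nothing beyond the signature; the helper name resolve_client_config is hypothetical and not part of the SDK:

```python
import os
from typing import Mapping, Optional, Tuple

# Copied from the module attributes listed above.
DEFAULT_ENDPOINT = "https://api.binaryai.cn/v1/endpoint"


def resolve_client_config(
    env: Mapping[str, str] = os.environ,
) -> Tuple[Optional[str], Optional[str], str]:
    """Hypothetical helper mirroring the constructor's default expressions."""
    secret_id = env.get("BINARYAI_SECRET_ID")
    secret_key = env.get("BINARYAI_SECRET_KEY")
    # `or` means an empty BINARYAI_ENDPOINT also falls back to the default.
    endpoint = env.get("BINARYAI_ENDPOINT") or DEFAULT_ENDPOINT
    return secret_id, secret_key, endpoint
```

If the defaults are written in the source exactly as the signature shows, Python evaluates them once when the class is defined, so environment variables set after binaryai.client is imported would not be picked up. Passing secret_id, secret_key, and endpoint explicitly avoids that dependency.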
upload(filepath: str | None = None, *, mem: bytes | None = None, hooks: Dict | None = None, sha256: str | None = None, md5: str | None = None, is_private: bool | None = True) -> str

Uploads a file to the server.

At least one of the following inputs must not be None:
  • File upload: set filepath to the path of the file on disk.

  • Memory upload: set mem to the contents of the file in memory.

If you only have a hash, you can pass sha256 or md5 instead, but FileRequiredError may be raised. Hashes are ignored if the file is already provided through filepath or mem. When multiple hashes are provided, only sha256 is used.

Memory upload, hash upload, and hooks are experimental features. They may change without notice.

Parameters:
  • filepath (Optional) – A pathname to a given file for file upload.

  • mem (Optional) – A byte buffer holding a file in memory to be uploaded.

  • hooks (Optional) – A dict to modify arguments before certain operations.

  • sha256 (Optional) – A string for hash upload.

  • md5 (Optional) – A string for hash upload.

Returns:

The actual sha256 that the server calculates and returns.
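
Since upload returns the sha256 calculated by the server, a caller can compare it against a locally computed digest. A minimal sketch, assuming `client` is any object exposing the upload(filepath) method documented above; the helper names local_sha256 and upload_and_verify are hypothetical:

```python
import hashlib


def local_sha256(filepath: str, chunk_size: int = 1 << 20) -> str:
    """Compute the hex sha256 of a file on disk, reading it in chunks."""
    digest = hashlib.sha256()
    with open(filepath, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def upload_and_verify(client, filepath: str) -> str:
    """Upload a file and check the server-reported sha256 against a local one.

    `client` is assumed to behave like binaryai.client.BinaryAI.
    """
    expected = local_sha256(filepath)
    reported = client.upload(filepath)
    # Compare case-insensitively; the case of the returned hex is an assumption.
    if reported.lower() != expected:
        raise ValueError(f"sha256 mismatch: local {expected}, server {reported}")
    return reported
```

The returned value can then be passed to the other methods of the client, all of which identify a file by its sha256.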

_reanalyze(sha256: str)

Reanalyze the target file.

Parameters:

sha256 – File sha256sum.

wait_until_analysis_done(sha256: str, timeout: int = DEFAULT_POLL_TIMEOUT, interval: int = DEFAULT_POLL_INTERVAL)

Waits until the latest stable result is available, that is, until all analyses of this file are done. You can set the wait timeout in seconds. If no stable result is available after the timeout, a TimeoutError is raised.

If some of the analyses being waited on fail rather than succeed, this function does not raise an exception. To get detailed status information, call get_analyze_status.

To analyze files in parallel, consider calling this function in a separate thread, since it waits by calling threading.Event. This function's implementation is a good reference for judging whether a file has finished analyzing.

Parameters:
  • sha256 – File sha256sum.

  • timeout (int) – Maximum wait time in seconds. If negative, wait forever.

  • interval (int) – Poll interval in seconds. An error is raised if it is not positive.
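
The timeout and interval semantics described above can be illustrated with a generic polling loop. This is a sketch of the documented behavior, not the SDK's implementation; poll_until and its is_done callback are hypothetical names, and raising ValueError for a non-positive interval is an assumption (the source only says that an error is raised):

```python
import threading
import time
from typing import Callable


def poll_until(is_done: Callable[[], bool], timeout: float = 60, interval: float = 2) -> None:
    """Call is_done() every `interval` seconds until it returns True.

    A negative timeout waits forever; otherwise TimeoutError is raised
    once `timeout` seconds have elapsed without success.
    """
    if interval <= 0:
        raise ValueError("interval must be positive")
    deadline = None if timeout < 0 else time.monotonic() + timeout
    # The event is never set, so wait(interval) just blocks for up to `interval` seconds.
    pause = threading.Event()
    while not is_done():
        if deadline is not None and time.monotonic() >= deadline:
            raise TimeoutError("no stable result before timeout")
        pause.wait(interval)
```

With a real client, is_done would be built on get_analyze_status; the layout of the dictionary it returns is not specified in this reference, so that part is left to the caller.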

get_analyze_status(sha256: str) -> Dict

Returns the current state of each analyzer. See the API documentation for the relationship between analyzers and results.

Parameters:

sha256 – File sha256sum.

get_sha256(md5: str) -> str

Get file sha256 by its md5.

Parameters:

md5 – File md5 hash.

Returns:

File sha256sum.

Return type:

str

get_filenames(sha256: str) -> List[str]

Get all uploaded filenames for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of filenames.

Return type:

List[str]

get_mime_type(sha256: str) -> str

Get MIME type for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

MIME type string.

Return type:

str

get_size(sha256: str) -> int

Get size in bytes of a given file.

Parameters:

sha256 – File sha256sum.

Returns:

File size in bytes.

Return type:

int
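
The metadata getters above (get_filenames, get_mime_type, get_size) all take the file's sha256, so they combine naturally into a single summary. A minimal sketch, assuming `client` behaves like BinaryAI; the FileSummary type and the summarize helper are hypothetical:

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileSummary:
    """Hypothetical container for the basic metadata of one file."""
    sha256: str
    filenames: List[str]
    mime_type: str
    size: int


def summarize(client, sha256: str) -> FileSummary:
    """Collect basic metadata with one call per documented getter."""
    return FileSummary(
        sha256=sha256,
        filenames=client.get_filenames(sha256),
        mime_type=client.get_mime_type(sha256),
        size=client.get_size(sha256),
    )
```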

get_compressed_files(sha256: str) -> List[binaryai.compressed_file.CompressedFile]

Get a list of files inside a compressed file identified by a sha256.

Parameters:

sha256 – File sha256sum.

Returns:

A list of files inside the compressed file.

Return type:

List[CompressedFile]

get_all_cve_names(sha256: str) -> List[str]

Get all CVE names for a given file.

Parameters:

sha256 – File sha256sum.

get_all_licenses(sha256: str) -> List[binaryai.license.License]

Get all licenses for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of licenses.

Return type:

List[License]

get_all_license_short_names(sha256: str) -> List[str]

Get all license short names for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of license short names.

Return type:

List[str]

get_all_ascii_strings(sha256: str) -> List[str]

Get all ASCII strings for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of ASCII strings.

Return type:

List[str]

get_sca_result(sha256: str) -> List[binaryai.component.Component]

Get SCA result for a given file.

Parameters:

sha256 – File sha256sum.

Returns:

A list of software components.

Return type:

List[Component]

get_overview(sha256: str) -> Dict[str, str | int]

Fetch an analysis overview from the BinaryAI Beat server by the file's sha256.

Returns:

Key-value pairs containing an overview of the binary file.

Fetch a file download link by the file's sha256.

Returns:

A link that can be used to download the file later. The link might expire.

list_func_offset(sha256: str) -> List[int]

Fetch offsets of functions from analysis.

Returns:

A list of function offsets.

list_funcs(sha256: str, batch_size: int = 32) -> Iterator[binaryai.function.Function]

Parses the list of functions and yields, through a generator, Function instances containing each function's name, fileoffset, bytes, and pseudocode.

Parameters:
  • sha256 – File sha256sum.

  • batch_size – Batch size used to fetch the functions' info.

Returns:

Function iterator

get_func_info(sha256: str, offset: int, with_embedding: bool = False) -> binaryai.function.Function

Fetch detailed information about the given function identified by its offset address.

Parameters:

offset – Offset address of the desired function.

Returns:

A Function instance containing the given function's name, fileoffset, bytes, and pseudocode.

get_funcs_info(sha256: str, offsets: List[int], batch_size: int = 32, with_embedding: bool = False) -> Iterator[binaryai.function.Function]

Fetch detailed information about the given functions identified by their offset addresses.

Parameters:
  • offsets – A list of offset addresses of the desired functions.

  • batch_size – Batch size used to fetch the functions' info.

Returns:

Function iterator

Raises:

ValueError – invalid batch size

_get_funcs_info(sha256: str, offsets: List[int], step: int = 32, with_embedding: bool = False) -> Iterator[binaryai.function.Function]

Get functions' info in batches.
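
Fetching in batches, as get_funcs_info and _get_funcs_info describe, amounts to slicing the offset list into chunks of at most batch_size items. A minimal sketch of that chunking; iter_batches is a hypothetical name and this is not the SDK's code:

```python
from typing import Iterator, List


def iter_batches(offsets: List[int], batch_size: int = 32) -> Iterator[List[int]]:
    """Yield consecutive slices of `offsets`, each at most `batch_size` long."""
    if batch_size <= 0:
        # Mirrors the documented ValueError for an invalid batch size.
        raise ValueError("batch_size must be positive")
    for start in range(0, len(offsets), batch_size):
        yield offsets[start:start + batch_size]
```

Because this sketch is a generator, the batch size check runs when iteration starts rather than when iter_batches is called. Each yielded chunk would correspond to one request for function details.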

get_func_match(sha256: str, offset: int) -> List[binaryai.function.MatchedFunction]

Find functions that match the given function, identified by its offset address.

Parameters:

offset – Offset address of the desired function.

Returns:

A list containing 10 match results. Each result is a Dict that contains a score and pseudocode. The list is sorted by score from high to low.

get_khash_info(sha256: str) -> tuple[bytes, str] | None

Return the KHash of this file. See the website for a detailed introduction to KHash.

Returns:

The KHash value and version. Only compare values if their versions are the same.

You are not expected to parse the version.

Return type:

Optional[Tuple[bytes, str]]
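
The rule that KHash values should only be compared when their versions match can be sketched as a small guard. same_khash is a hypothetical helper; it checks exact equality only, since this reference does not describe a similarity metric:

```python
from typing import Optional, Tuple

KHashInfo = Optional[Tuple[bytes, str]]  # shape returned by get_khash_info


def same_khash(a: KHashInfo, b: KHashInfo) -> Optional[bool]:
    """Return True/False if the two KHashes are comparable, else None.

    None is returned when either result is missing or the versions differ.
    The version string is compared for equality only and never parsed.
    """
    if a is None or b is None:
        return None
    (value_a, version_a), (value_b, version_b) = a, b
    if version_a != version_b:
        return None
    return value_a == value_b
```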

get_malware_probability(sha256: str) -> float | None

Return the malware probability of this file. 0 usually means a clean (benign) file, while 1 means the file is risky.

This is an experimental feature and may change without notice.

Returns:

Probability of the file. None means no result is available.

Return type:

Optional[float]
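
Because get_malware_probability may return None, callers need to handle three outcomes. A minimal sketch; the label_probability helper and the 0.5 threshold are illustrative assumptions, not values taken from the SDK:

```python
from typing import Optional


def label_probability(probability: Optional[float], threshold: float = 0.5) -> str:
    """Map a malware probability to a coarse label.

    The threshold is an arbitrary illustration; pick one that suits your data.
    """
    if probability is None:
        return "unknown"  # no result is available
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be within [0, 1]")
    return "risky" if probability >= threshold else "likely clean"
```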