Quick Start

The example below shows how to use the SDK to upload a file, have it analyzed, and retrieve the analysis results.

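Initialization

Initializing the SDK requires a Secret ID and a Secret Key. If you do not have them yet, please contact us to obtain a pair.

The Secret ID and Secret Key are the **only credentials** for accessing the API, so keep them safe. We recommend putting them in environment variables rather than hard-coding them: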

$ read BINARYAI_SECRET_ID
#(enter your secret id)
$ read BINARYAI_SECRET_KEY
#(enter your secret key)
$ export BINARYAI_SECRET_ID
$ export BINARYAI_SECRET_KEY

Once the environment variables are set, the SDK reads them automatically.
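Before creating the client, it can help to confirm that both variables are actually visible to the Python process. The following is a minimal sketch using only the standard library; the helper name `missing_credentials` is hypothetical and not part of the SDK, and the variable names are the ones used in the shell snippet above.

```python
import os

# Variable names taken from the shell snippet above.
REQUIRED_VARS = ("BINARYAI_SECRET_ID", "BINARYAI_SECRET_KEY")


def missing_credentials(environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_credentials()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("Credentials found in the environment.")
```

This only checks that the variables exist; it does not validate the credentials against the service.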

To initialize the SDK:

[1]:
# Uncomment to get more logs
# import logging
# import sys
# logging.basicConfig(stream=sys.stdout, level=logging.INFO)
# logger = logging.getLogger("binaryai_sdk")

from binaryai import BinaryAI

bai = BinaryAI() # Initialize the client

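Great! If no exception was raised, the client is initialized.

Upload and analyze a file

Note: if the file is too large or you upload too frequently, the upload request may be rejected.

Now you can upload a file by its local path: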

[2]:
# if the upload succeeds, the file hash is returned
sha256 = bai.upload("/bin/echo")

# wait until done. timeout=-1 means wait forever
bai.wait_until_analysis_done(sha256, timeout=-1)

print("analysis succeed")
analysis succeed
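Because an upload request may be rejected when uploads are too frequent, one option is to retry with an increasing delay. The sketch below is generic and not part of the SDK: `upload_with_retry` is a hypothetical helper, and since this guide does not say which exception a rejected upload raises, it catches `Exception` broadly as a stated assumption.

```python
import time


def upload_with_retry(upload, path, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call upload(path), retrying with exponential backoff on failure.

    `upload` is any callable, for example `bai.upload`. The exception type
    raised on a rejected upload is an assumption here, so Exception is
    caught broadly; narrow it once the real type is known.
    """
    for attempt in range(attempts):
        try:
            return upload(path)
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ...
```

With the client from the cell above, this could be called as `sha256 = upload_with_retry(bai.upload, "/bin/echo")`. Retrying will not help if the file is rejected for being too large.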

Get the analysis results

Call the SDK methods with the file's sha256 to access the analysis results:

[3]:
bai.get_overview(sha256)
[3]:
{'fileType': 'ELF64',
 'machine': 'AMD64',
 'platform': 'LINUX',
 'endian': 'LITTLE_ENDIAN',
 'loader': 'x86:LE:64:default',
 'entryPoint': 1059200,
 'baseAddress': 1048576}
[4]:
funcs = bai.list_funcs(sha256)
for i, f in enumerate(funcs):
    print("[{}: {}]".format(i+1, f.name))
    if i > 10:
        break
[1: _DT_INIT]
[2: FUN_00102020]
[3: <EXTERNAL>::getenv]
[4: <EXTERNAL>::free]
[5: <EXTERNAL>::abort]
[6: <EXTERNAL>::__errno_location]
[7: <EXTERNAL>::strncmp]
[8: <EXTERNAL>::_exit]
[9: <EXTERNAL>::__fpending]
[10: <EXTERNAL>::textdomain]
[11: <EXTERNAL>::fclose]
[12: <EXTERNAL>::bindtextdomain]
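The `enumerate` and `break` pattern above prints the first 12 entries, since the loop stops only after `i` exceeds 10. If a fixed number of items is wanted, `itertools.islice` states the limit directly. This is a sketch only: `Func` is a hypothetical stand-in for the SDK's function objects (only a `name` attribute is assumed), and `first_names` is not an SDK function.

```python
from collections import namedtuple
from itertools import islice

# Hypothetical stand-in for the SDK's function objects; only `.name` is used.
Func = namedtuple("Func", ["name"])


def first_names(funcs, limit):
    """Return the names of at most `limit` items without consuming the rest."""
    return [f.name for f in islice(funcs, limit)]


# A lazy generator of 1000 fake functions; only the first 3 are materialized.
demo = (Func("FUN_{:08x}".format(i)) for i in range(1000))
print(first_names(demo, 3))
```

With the real client, the equivalent call would be `first_names(bai.list_funcs(sha256), 12)`.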

Alternatively, you can create a file object and call methods on it:

[5]:
from binaryai import BinaryAIFile
# These two hashes identify the same file; either one can be used
sha256 = "289616b59a145e2033baddb8a8a9b5a8fb01bdbba1b8cf9acadcdd92e6cc0562"
md5 = "c3366c6b688a5b5fa4451fec09930e06"
bai_file = BinaryAIFile(bai, md5=md5)
for component in bai_file.get_sca_result():
    print(component.name)
    print("----")
reptile
----
tsh
----

You can also get a file's KHash, which can be used to measure how similar two files are:

[6]:
from binaryai import BinaryAIFile

fileA = BinaryAIFile(bai, md5="346136457e1eb6eca44a06bb55f93284").get_khash_info()
fileB = BinaryAIFile(bai, sha256="841de34799fc46bf4b926559e4e7a70e0cc386050963978d5081595e9a280ae1").get_khash_info()
fileC = BinaryAIFile(bai, sha256="9b53a3936c8c4202e418c37cbadeaef7cc7471f6a6522f6ead1a19b31831f4a1").get_khash_info()
assert fileA[1] == fileB[1]
assert fileB[1] == fileC[1]

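# calculate similarity as 1 - normalized hamming distance
def khash_similarity(khash_a: str, khash_b: str):
    from scipy.spatial import distance
    # zero-pad to 4 bits per hex digit so leading zero bits are kept
    # and both bit lists always have the same length
    bits_a = list(bin(int(khash_a, 16))[2:].zfill(len(khash_a) * 4))
    bits_b = list(bin(int(khash_b, 16))[2:].zfill(len(khash_b) * 4))
    return 1 - distance.hamming(bits_a, bits_b)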
print(f"A<->B: {khash_similarity(fileA[0].hex(), fileB[0].hex())}")
print(f"A<->C: {khash_similarity(fileA[0].hex(), fileC[0].hex())}")
print(f"B<->C: {khash_similarity(fileB[0].hex(), fileC[0].hex())}")

A<->B: 0.958984375
A<->C: 0.583984375
B<->C: 0.580078125
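The same bit-level similarity can also be computed without SciPy by working on the raw bytes. The sketch below assumes the first element returned by `get_khash_info()` is a `bytes` object of fixed length, which the `.hex()` calls above suggest but this guide does not state explicitly; `khash_similarity_bytes` is a hypothetical helper, not part of the SDK.

```python
def khash_similarity_bytes(khash_a: bytes, khash_b: bytes) -> float:
    """Fraction of matching bits between two equal-length byte strings."""
    if len(khash_a) != len(khash_b):
        raise ValueError("KHash values must have the same length")
    if not khash_a:
        raise ValueError("KHash values must not be empty")
    # XOR leaves a 1 wherever the two bytes differ; count those bits.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(khash_a, khash_b))
    return 1 - differing / (len(khash_a) * 8)
```

Under that assumption it would be called as `khash_similarity_bytes(fileA[0], fileB[0])`. Working on bytes avoids the hex and binary string round trip, so leading zero bits cannot be lost.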

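As shown above, the results can be accessed given only the file's hash.

For more information, see the examples/ folder in the SDK repository or browse the rest of the documentation.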