World Digital Technology Academy (WDTA)

Large Language Model Security Testing Method

World Digital Technology Academy Standard
WDTA AI-STR-02
Edition: 2024-04

© WDTA 2024 – All rights reserved.

The World Digital Technology Standard WDTA AI-STR-02 is designated as a WDTA norm. This document is the property of the World Digital Technology Academy (WDTA) and is protected by international copyright laws. Any use of this document, including reproduction, modification, distribution, or republication, without the prior written permission of WDTA, is prohibited. WDTA is not liable for any errors or omissions in this document.

Discover more WDTA standard and related publications at https://wdtacademy.org/.

Version History*

Standard ID      Version   Date      Changes
WDTA AI-STR-02   1.0       2024-04   Initial Release

Foreword

The "Large Language Model Security Testing Method," developed and issued by the World Digital Technology Academy (WDTA), represents a crucial advancement in our ongoing commitment to ensuring the responsible and secure use of artificial intelligence technologies. As AI systems, particularly large language models, continue to become increasingly integral to various aspects of society, the need for a comprehensive standard to address their security challenges becomes paramount. This standard, an integral part of WDTA's AI STR (Safety, Trust, Responsibility) program, is specifically designed to tackle the complexities inherent in large language models and provide rigorous evaluation metrics and procedures to test their resilience against adversarial attacks.

This standard document provides a framework for evaluating the resilience of large language models (LLMs) against adversarial attacks. The framework applies to the testing and validation of LLMs across various attack classifications, including L1 Random, L2 Blind-Box, L3 Black-Box, and L4 White-Box. Key metrics used to assess the effectiveness of these attacks include the Attack Success Rate (R) and Decline Rate (D). The document outlines a diverse range of attack methodologies, such as instruction hijacking and prompt masking, to comprehensively test the LLMs' resistance to different types of adversarial techniques. The testing procedure detailed in this standard document aims to establish a structured approach for evaluating the robustness of LLMs against adversarial attacks, enabling developers and organizations to identify and mitigate potential vulnerabilities, and ultimately improve the security and reliability of AI systems built using LLMs.

By establishing the "Large Language Model Security Testing Method," WDTA seeks to lead the way in creating a digital ecosystem where AI systems are not only advanced but also secure and ethically aligned. It symbolizes our dedication to a future where digital technologies are developed with a keen sense of their societal implications and are leveraged for the greater benefit of all.

Executive Chairman of WDTA

