📄 Paper information
- API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs
- EMNLP 2023 main
🔑 Key message of this paper
- When building data for tool-calling tasks, it is important to cover a diverse range of domains.
🎓 What problem does the paper address?
- Building a benchmark that evaluates how well LLMs can use external tools.
🧑‍🎓 Why is this problem important?
- An LLM's performance depends on its training data, so it cannot reflect up-to-date information in its responses.
- External tools let an LLM handle up-to-date information, but how well LLMs actually use external tools has not been evaluated.
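💡 What is the key idea for solving the problem?
- The authors built both a benchmark dataset and training data for evaluating the API-calling performance of LLMs.
  - Benchmark dataset construction
    - Tasks are defined along two axes: how many API calls are required, and how many APIs are available to call.
    - Tasks fall into three types:
      - Call: one or more API calls, with a small number of APIs.
      - Retrieve+Call: a single API call, with a large number of APIs. The LLM is not told which APIs it can use.
      - Plan+Retrieve+Call: multiple API calls, with a large number of APIs. The LLM is not told which APIs it can use.
    - The APIs used in the benchmark are actually implemented (though they are probably fictional APIs).
    - Annotation is done by hand.
    - Evaluation metrics: the accuracy of the queries (API calls) the LLM produces, and the ROUGE-L score of the text it generates from the API responses.
  - Training dataset construction
    - The training set is a synthetic dataset created with LLMs.
    - Five LLMs each generate their part of the data independently:
      - Specify the domain of the data, such as healthcare.
      - Synthesize APIs for that domain. Real data is supplied as examples at synthesis time.
      - Randomly sample from the synthesized APIs and create queries.
      - Generate the API responses.
      - Evaluate whether the generated content fits the dataset's requirements, and filter out what does not.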
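The task taxonomy and the two evaluation metrics in the list above can be made concrete with a small sketch. It is an illustration under stated assumptions, not the benchmark's own code: the function names, the exact-match rule for API calls, the whitespace tokenization, and the F1 form of ROUGE-L are all choices made here, and the API names in the usage example are hypothetical.

```python
from dataclasses import dataclass


def classify_task(num_api_calls: int, api_pool_is_large: bool) -> str:
    """Map a sample to one of the three task types summarized above."""
    if num_api_calls < 1:
        raise ValueError("a sample needs at least one API call")
    if not api_pool_is_large:
        # Few APIs: one or more calls all count as "Call".
        return "Call"
    if num_api_calls == 1:
        # Many APIs, none listed for the model: it must retrieve one first.
        return "Retrieve+Call"
    # Many APIs and several calls: planning is needed as well.
    return "Plan+Retrieve+Call"


@dataclass(frozen=True)
class ApiCall:
    name: str
    arguments: tuple  # sorted (key, value) pairs, so keyword order is irrelevant


def make_call(name: str, **kwargs) -> ApiCall:
    return ApiCall(name, tuple(sorted(kwargs.items())))


def call_accuracy(predicted: list, gold: list) -> float:
    """Fraction of gold calls matched exactly, position by position.

    Assumption: exact match on name and arguments; missing predictions
    count as wrong.
    """
    if not gold:
        return 0.0
    correct = sum(1 for p, g in zip(predicted, gold) if p == g)
    return correct / len(gold)


def lcs_length(a: list, b: list) -> int:
    """Length of the longest common subsequence (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l_f1(candidate: str, reference: str) -> float:
    """ROUGE-L as the F1 of LCS-based precision and recall.

    Assumption: lowercased whitespace tokens, no stemming.
    """
    cand, ref = candidate.lower().split(), reference.lower().split()
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)
    recall = lcs / len(ref)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical API names, for illustration only.
    gold = [make_call("add_reminder", content="dentist", time="09:00"),
            make_call("query_weather", city="Tokyo")]
    pred = [make_call("add_reminder", time="09:00", content="dentist"),
            make_call("get_weather", city="Tokyo")]  # wrong function name
    print(classify_task(len(gold), api_pool_is_large=True))
    print(call_accuracy(pred, gold))
    print(rouge_l_f1("the meeting is booked for monday",
                     "the meeting is booked on monday"))
```

The second predicted call has the right arguments but the wrong function name, so it counts as incorrect. Exact matching is a deliberate simplification. A benchmark that actually executes its APIs could instead compare the effects of the calls.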
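👀 What was newly found?
- A model called Lynx was evaluated using the dataset created with the proposed method.
- Training on this data improves the LLM's performance.
  - If the training data resembles the benchmark, this seems only natural.
- Error trends: before training, most errors were cases where no API call was made; after training, they shifted to cases where the API function name was wrong.
- On the benchmark, a comparison across models such as GPT shows that Call tends to be the easiest and Plan+Retrieve+Call the hardest.
- Compared with ToolAlpaca, comparable performance is obtained with less data.
  - Can the data really be called high quality? The result seems likely to change depending on the evaluation data.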
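❓ What questions remain?
- If the benchmark and the training data are built along similar lines, it seems only to be expected that the evaluation results improve.
- The method itself looks useful as a reference.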