[{"data":1,"prerenderedAt":205},["ShallowReactive",2],{"DlFXI4Eibt_Bn9lrEZz1TYbHCWFZj3IvqwHQSEW-Exc":3,"61YTsyjmN0YXKhmXZdkxQf2sPO3EgMAcdm1vdPpFNfE":194},{"code":4,"msg":5,"data":6},0,"",{"category":7,"tag":11,"hot":39,"new":78,"banner":118,"data":143,"cache":193},[8,9,10],"Agent","OpenAI","LLM",[12,14,17,20,23,25,27,30,33,36],{"title":8,"total":13},39,{"title":15,"total":16},"Google",44,{"title":18,"total":19},"Nvidia",13,{"title":21,"total":22},"Claude",11,{"title":9,"total":24},35,{"title":10,"total":26},85,{"title":28,"total":29},"DeepSeek",9,{"title":31,"total":32},"OCR",1,{"title":34,"total":35},"Chat",7,{"title":37,"total":38},"Generator",116,[40,48,55,64,71],{"id":41,"publish_date":42,"is_original":4,"collection":5,"cover_url":43,"cover_url_1_1":44,"title":45,"summary":46,"author":47},557,"2022-04-29","article_res/cover/7a9b1375ed9bb298154981bae42b794d.jpeg","article_res/cover/afa281dd52bc0454e6735daa8e6b0706.jpeg","Translation and summary of Messari Report [2.8 Kristin Smith, Blockchain Association and Katie Haun, a16z]","We need unity and speed right now.","Translation",{"id":49,"publish_date":50,"is_original":4,"collection":5,"cover_url":51,"cover_url_1_1":52,"title":53,"summary":54,"author":47},531,"2022-05-25","article_res/cover/e8362057f8fa189594c60afdfaaeb6e5.jpeg","article_res/cover/8ea08d0d6fa7eee6b57ed4ec61b61ad6.jpeg","Decentralized Society: Finding Web3’s Soul / Decentralized Society: Finding the Soul of Web3 -7","Decentralization through Pluralism When analyzing ecosystems, it's desirable to measure how decentralized it is.",{"id":56,"publish_date":57,"is_original":32,"collection":58,"cover_url":59,"cover_url_1_1":60,"title":61,"summary":62,"author":63},127,"2024-11-14","#Google #AI Game #World Model #AI Story","article_res/cover/0233a875b7ec2debf59779e311547569.jpeg","article_res/cover/6ffddb6ae4914b3c699493311aa9f198.jpeg","Google Launches \"Unbounded\": A Generative Infinite Character Life Simulation Game","Unbounded: A Generative Infinite 
Game of Character Life Simulation","Renee's Entrepreneurial Journey",{"id":13,"publish_date":65,"is_original":32,"collection":66,"cover_url":67,"cover_url_1_1":68,"title":69,"summary":70,"author":63},"2025-02-14","#Deep Dive into LLMs #Andrej Karpathy #LLM #Tool Use #Hallucination","article_res/cover/11e858ad6b74dfa80f923d549b62855c.jpeg","article_res/cover/615e1b320f1fc163edc1d2d154a6de33.jpeg","Andrej Karpathy's in-depth explanation of LLM (Part 4): Hallucinations","hallucinations, tool use, knowledge/working memory",{"id":72,"publish_date":73,"is_original":4,"collection":5,"cover_url":74,"cover_url_1_1":75,"title":76,"summary":77,"author":47},579,"2022-04-07","article_res/cover/39387376ba28447af1eb40576b9df215.jpeg","article_res/cover/02727ede8551ed49901d0abe6d6305b7.jpeg","Messari Report Translation and Summary 【1-7 Surviving the Winter】","I’d be more cautious here: 10 year and 10 hour thinking only.",[79,87,95,103,111],{"id":80,"publish_date":81,"is_original":32,"collection":82,"cover_url":83,"cover_url_1_1":84,"title":85,"summary":86,"author":63},627,"2025-03-20","#AI Avatar #AI Video Generation","article_res/cover/d95481358f73924989f8c4ee9c75d1c8.jpeg","article_res/cover/b74bc0fab01f8b6a6aa87696c0c3ed8b.jpeg","DisPose: Generating Animated Videos by Driving Video with Reference Images","DisPose is a controllable human image animation method that enhances video generation.",{"id":88,"publish_date":89,"is_original":32,"collection":90,"cover_url":91,"cover_url_1_1":92,"title":93,"summary":94,"author":63},626,"2025-03-21","#Deep Dive into LLMs #LLM #RL #Andrej Karpathy #AlphaGo","article_res/cover/446553a5c8f8f2f07d97b20eaee84e56.jpeg","article_res/cover/e6c2823409c9b34624064b9acbaca6f1.jpeg","AlphaGo and the Power of Reinforcement Learning - Andrej Karpathy's Deep Dive on LLMs (Part 9)","Simply learning from humans will never surpass human 
capabilities.",{"id":96,"publish_date":97,"is_original":32,"collection":98,"cover_url":99,"cover_url_1_1":100,"title":101,"summary":102,"author":63},625,"2025-03-22","#Deep Dive into LLMs #LLM #RL #RLHF #Andrej Karpathy","article_res/cover/8da81d38b1e5cf558a164710fd8a5389.jpeg","article_res/cover/96f028d76c362a99a0dd56389e8f7a9b.jpeg","Reinforcement Learning from Human Feedback (RLHF) - Andrej Karpathy's Deep Dive on LLMs (Part 10)","Fine-Tuning Language Models from Human Preferences",{"id":104,"publish_date":105,"is_original":32,"collection":106,"cover_url":107,"cover_url_1_1":108,"title":109,"summary":110,"author":63},624,"2025-03-23","#Deep Dive into LLMs #LLM #Andrej Karpathy #AI Agent #MMM","article_res/cover/a5e7c3d48bb09109684d6513287c661d.jpeg","article_res/cover/d3f22b7c0ab8d82fd2da457a299e0773.jpeg","The Future of Large Language Models - Andrej Karpathy's In-Depth Explanation of LLM (Part 11)","preview of things to come",{"id":112,"publish_date":105,"is_original":32,"collection":113,"cover_url":114,"cover_url_1_1":115,"title":116,"summary":117,"author":63},623,"#Google #Voe #AI Video Generation","article_res/cover/c44062fea0f336c2b96b3928292392c2.jpeg","article_res/cover/a041041c69092ad3db191c5bf3ff981b.jpeg","Trial of Google's video generation model VOE2","Our state-of-the-art video generation model",[119,127,135],{"id":120,"publish_date":121,"is_original":32,"collection":122,"cover_url":123,"cover_url_1_1":124,"title":125,"summary":126,"author":63},160,"2024-10-04","#Philosophy","article_res/cover/496990c49211e8b7f996b7d39c18168e.jpeg","article_res/cover/14dbaa1ade9cb4316d5829423a900362.jpeg","Time","The fungus of the morning does not know the waxing and waning of the moon, and the cicada does not know the seasons; this is a short life. To the south of the state of Chu there is a dark spirit which regards five hundred years as spring and five hundred years as autumn. 
In ancient times there was a great tree called the Ming which regarded eight thousand years as spring and eight thousand years as autumn; this is a long life.",{"id":128,"publish_date":129,"is_original":32,"collection":130,"cover_url":131,"cover_url_1_1":132,"title":133,"summary":134,"author":63},98,"2024-12-17","#AI Video Generator #Sora #Pika","article_res/cover/3b86e85d03fff4f356a3e4cf2bb329c9.jpeg","article_res/cover/5fa5c20ad0b40f8f544d257c0ef02938.jpeg","Pika 2.0 video generation officially released: effect comparison with Sora","Today, we launched the Pika 2.0 model. Superior text alignment. Stunning visuals. And ✨Scene Ingredients✨",{"id":136,"publish_date":137,"is_original":32,"collection":138,"cover_url":139,"cover_url_1_1":140,"title":141,"summary":142,"author":63},71,"2025-01-14","#Nvidia #World Foundation Model #Cosmos #Physical AI #Embodied AI","article_res/cover/feddf8c832dfb45d28804291f6a42a9e.jpeg","article_res/cover/d6bc2f1186d96b78228c2283a17a3645.jpeg","NVIDIA's Cosmos World Model","Cosmos World Foundation Model Platform for Physical AI",[144,163,188],{"title":8,"items":145},[146,147,155],{"id":104,"publish_date":105,"is_original":32,"collection":106,"cover_url":107,"cover_url_1_1":108,"title":109,"summary":110,"author":63},{"id":148,"publish_date":149,"is_original":32,"collection":150,"cover_url":151,"cover_url_1_1":152,"title":153,"summary":154,"author":63},622,"2025-03-24","#OWL #AI Agent #MAS #MCP #CUA","article_res/cover/cb50ca7f2bf4d1ed50202d7406e1c19a.jpeg","article_res/cover/4aa7aa3badfacf3cc84121334f1050dd.jpeg","OWL: Multi-agent collaboration","OWL: Optimized Workforce Learning for General Multi-Agent Assistance in Real-World Task Automation",{"id":156,"publish_date":157,"is_original":32,"collection":158,"cover_url":159,"cover_url_1_1":160,"title":161,"summary":162,"author":63},620,"2025-03-26","#LLM #Google #Gemini #AI Agent","article_res/cover/53751a6dbbe990b1eb0b63f3b062aed4.jpeg","article_res/cover/031344981f0a212ff82d1f3a64aa5756.jpeg","Gemini 2.5 Pro, claimed to be far ahead of the 
competition, has been released with great fanfare: comprehensively surpassing other LLMs and topping the global rankings","Gemini 2.5: Our most intelligent AI model",{"title":9,"items":164},[165,172,180],{"id":166,"publish_date":157,"is_original":32,"collection":167,"cover_url":168,"cover_url_1_1":169,"title":170,"summary":171,"author":63},619,"#OpenAI #AI Image Generator #4o #MMM #AR Transformer","article_res/cover/2faffc97fcecf3151552cb0fd3206d89.jpeg","article_res/cover/1133cb4948af44cee2e7fbe79efb69e5.jpeg","The native image function of GPT-4o is officially launched","Introducing 4o Image Generation",{"id":173,"publish_date":174,"is_original":4,"collection":175,"cover_url":176,"cover_url_1_1":177,"title":178,"summary":179,"author":63},434,"2023-07-15","#Anthropic #OpenAI #Google #AI Code Generator #Claude","article_res/cover/e1b6f600a2b9f262a4392684e5f2ce25.jpeg","article_res/cover/6e1772e83f78f9a351ab23d3e414adee.jpeg","Latest Updates on Google Bard /Anthropic Claude2 / ChatGPT Code Interpreter","We want our models to use their programming skills to provide more natural interfaces to the basic functions of our computers.  
\n - OpenAI",{"id":181,"publish_date":182,"is_original":4,"collection":183,"cover_url":184,"cover_url_1_1":185,"title":186,"summary":187,"author":63},417,"2023-08-24","#OpenAI","article_res/cover/bccf897d50a88b18364e35f7466387e0.jpeg","article_res/cover/2f871085c1073717c1703ae86e18056f.jpeg","The GPT-3.5 Turbo fine-tuning (fine-tuning function) has been released～","Developers can now bring their own data to customize GPT-3.5 Turbo for their use cases.",{"title":10,"items":189},[190,191,192],{"id":88,"publish_date":89,"is_original":32,"collection":90,"cover_url":91,"cover_url_1_1":92,"title":93,"summary":94,"author":63},{"id":96,"publish_date":97,"is_original":32,"collection":98,"cover_url":99,"cover_url_1_1":100,"title":101,"summary":102,"author":63},{"id":104,"publish_date":105,"is_original":32,"collection":106,"cover_url":107,"cover_url_1_1":108,"title":109,"summary":110,"author":63},true,{"code":4,"msg":5,"data":195},{"id":196,"publish_date":197,"is_original":32,"collection":198,"articles_id":199,"cover_url":200,"cover_url_1_1":201,"title":202,"summary":203,"author":63,"content":204},56,"2025-01-28","#DeepSeek #MMM #AI Image Generator #Multimodal Understanding","KLzGyOtuEGzUlZk_0MhceQ","article_res/cover/3b7efa559e0888141c890e933d504510.jpeg","article_res/cover/653880885ccb19b1ba9496f2aaee4a38.jpeg","DeepSeek Janus Series: Unified Multimodal Understanding and Generation Models","Janus-Series: Unified Multimodal Understanding and Generation Models","\u003Cdiv class=\"rich_media_content js_underline_content\n                       autoTypeSetting24psection\n            \" id=\"js_content\">\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>(Details see ⬇️), compared with other closed-source or open-source solutions currently available, there is 
still a gap.\u003C/p>\u003Ch2 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 22px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">A brief introduction to the three models\u003C/span>\u003C/h2>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Janus: Decoupling visual encoding for unified multimodal understanding and generation\u003C/strong>\u003C/span>\u003C/h3>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Janus is an autoregressive framework that unifies multimodal understanding and generation. 
Its unique advantages include:\u003C/p>\u003Cul style='margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);' class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Decoupling visual encoding into separate pathways alleviates the conflict between understanding and generation, while the model still uses a single unified Transformer architecture.\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">The decoupled design makes the framework more flexible, enabling it to surpass earlier unified models on multimodal tasks and to match task-specific models.\u003C/section>\u003C/li>\u003C/ul>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Technical features\u003C/strong>：\u003C/p>\u003Cul style='margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);' class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Simplified design: Reducing 
architectural complexity.\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Efficiency: Performs excellently across multiple tasks, becoming a strong candidate for next-generation multimodal models.\u003C/section>\u003C/li>\u003C/ul>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Model download links and paper addresses\u003C/strong>：\u003C/p>\u003Cul style='margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);' class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cstrong style=\";color: rgb(0, 0, 0);background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Janus-1.3B\u003C/strong>：https://huggingface.co/deepseek-ai/Janus-1.3B\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cstrong style=\";color: rgb(0, 0, 0);background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: 
none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Paper\u003C/strong>：https://arxiv.org/abs/2410.13848\u003C/section>\u003C/li>\u003C/ul>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009656\" data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771575520.5161596799911616.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">\u003Cbr>\u003C/strong>\u003C/span>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">JanusFlow: Harmonizing autoregression and rectified 
flow\u003C/strong>\u003C/span>\u003C/h3>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Its technical highlights include:\u003C/p>\u003Cul style='margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);' class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Requires no complex architectural modifications: rectified flow can be trained directly within the large language model framework.\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Achieves performance comparable or even superior to specialized models in visual and language tasks.\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Significantly outperforms existing unified methods on standard benchmarks.\u003C/section>\u003C/li>\u003C/ul>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Model download links and paper 
addresses\u003C/strong>：\u003C/p>\u003Cul style='margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);' class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cstrong style=\";color: rgb(0, 0, 0);background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">JanusFlow-1.3B\u003C/strong>：https://huggingface.co/deepseek-ai/JanusFlow-1.3B\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cstrong style=\";color: rgb(0, 0, 0);background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Paper\u003C/strong>：https://arxiv.org/abs/2411.07975\u003C/section>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection data-mid=\"\" mpa-from-tpl=\"t\" style=\"display: flex;justify-content: center;align-items: center;width: 578px;\">\u003Csection data-mid=\"\" mpa-from-tpl=\"t\" style=\"display: flex;justify-content: center;align-items: center;width: 578px;\">\u003Csection data-mid=\"\" mpa-from-tpl=\"t\" style=\"width: 259px;display: flex;justify-content: center;align-items: center;\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009669\" 
data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771575530.8297392605139877.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">\u003Cbr>\u003C/strong>\u003C/span>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">\u003C/strong>\u003C/span>\u003C/h3>\u003C/li>\u003C/ul>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Janus-Pro: Unified multimodal understanding and generation with data and model scaling\u003C/strong>\u003C/span>\u003C/h3>\u003Cul style='margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);' 
class=\"list-paddingleft-2\">\u003Cli>\u003Cp style='margin-bottom: 0px;padding-top: 8px;padding-bottom: 8px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Janus-Pro is an advanced version of Janus that integrates the following core improvements:\u003C/p>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Significantly improves generalization ability and stability.\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Enriches the scenarios and diversity of multimodal inputs.\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Enhances the model's understanding ability and generation stability.\u003C/section>\u003C/li>\u003Cli>\u003Cp style='margin-bottom: 0px;padding-top: 8px;padding-bottom: 8px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>These improvements enable Janus-Pro to excel in the following areas:\u003C/p>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Achieves industry-leading performance in complex tasks.\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">Follows complex instructions more effectively and generates stable, high-quality images.\u003C/section>\u003C/li>\u003Cli>\u003Cp style='margin-bottom: 
0px;padding-top: 8px;padding-bottom: 8px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Model download links and paper addresses\u003C/strong>：\u003C/p>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cstrong style=\";color: rgb(0, 0, 0);background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Janus-Pro-1B\u003C/strong>：https://huggingface.co/deepseek-ai/Janus-Pro-1B\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cstrong style=\";color: rgb(0, 0, 0);background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Janus-Pro-7B\u003C/strong>：https://huggingface.co/deepseek-ai/Janus-Pro-7B\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cstrong style=\";color: rgb(0, 0, 0);background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 
0px;\">Paper\u003C/strong>：https://github.com/deepseek-ai/Janus/blob/main/janus_pro_tech_report.pdf\u003C/section>\u003C/li>\u003C/ul>\u003Ch2 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 289px;height: 13px;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009658\" data-ratio=\"0.04498269896193772\" data-w=\"578\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771575680.08012381418682057.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 22px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cbr>\u003C/span>\u003Cspan style=\";font-size: 22px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Trial picture understanding\u003C/span>\u003C/h2>\u003Csection style=';margin-top: 20px;margin-bottom: 20px;padding: 10px 20px;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgb(250, 250, 250);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Csection style=\";\">\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 
8px;\">: Understand this meme\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009615\" data-ratio=\"0.75\" data-s=\"300,640\" data-type=\"jpeg\" data-w=\"800\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771575720.700947141594765.jpeg\">\u003C/p>\u003C/section>\u003C/section>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">This meme humorously contrasts two types of AI assistants: on the left is someone using an OpenAI assistant, with a camera mounted on their head, implying surveillance bias; on the right is someone using a DeepSee assistant, holding a camera pointed at the screen, implying focus on visual analysis and data collection. 
The meme satirizes OpenAI's emphasis on surveillance, while DeepSee focuses more on visual analysis.\u003C/p>\u003C/blockquote>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">JanusFlow\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">OpenAI and DeepSeek are two AI models designed to help humans complete various tasks. OpenAI is a general-purpose large language model that can be used for chatbots, language translation, and text generation in a wide range of applications; whereas DeepSeek focuses more on searching for specific information from large datasets. 
These two models each have their own strengths and weaknesses, often combined to accomplish more complex tasks.\u003C/p>\u003C/blockquote>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus-Pro\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">This image is a humorous meme that visually metaphorically compares the two AI systems \"OpenAI\" and \"DeepSeek\".\u003C/p>\u003Col style=\"margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;\" class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Left Side 
(OpenAI):\u003C/strong>\u003C/p>\u003C/section>\u003C/li>\u003C/ol>\u003Cul style=\"margin-top: 8px;margin-bottom: 8px;;list-style-type: disc;padding-left: 25px;color: rgb(0, 0, 0);\" class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">A person wearing a helmet with multiple cameras or lenses, aiming at a target. This symbolizes OpenAI being more \"focused\" or \"precise\" on specific tasks or goals.\u003C/section>\u003C/li>\u003C/ul>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Right Side (DeepSeek):\u003C/strong>\u003C/p>\u003C/section>\u003C/li>\u003Cul style=\"margin-top: 8px;margin-bottom: 8px;;list-style-type: disc;padding-left: 25px;color: rgb(0, 0, 0);\" class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">A person aiming with a gun, but the target is invisible. 
This implies that although DeepSeek \"aims\" in a certain direction, the specific goal or result is not yet clear.\u003C/section>\u003C/li>\u003C/ul>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Overall Meaning:\u003C/strong>\u003C/p>\u003C/section>\u003C/li>\u003Cul style=\"margin-top: 8px;margin-bottom: 8px;;list-style-type: disc;padding-left: 25px;color: rgb(0, 0, 0);\" class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">The meme humorously implies that OpenAI is more focused and directed towards a specific goal, while DeepSeek is more ambiguous or uncertain in its approach.\u003C/section>\u003C/li>\u003C/ul>\u003C/blockquote>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009620\" 
data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771575690.4193783606009829.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Csection style=';margin-top: 20px;margin-bottom: 20px;padding: 10px 20px;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgb(250, 250, 250);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Csection style=\";\">\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">: Spot the differences\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009621\" data-ratio=\"0.49444444444444446\" data-s=\"300,640\" data-type=\"webp\" data-w=\"1080\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771588090.6392049591610507.jpeg\">\u003C/p>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Correct answer\u003C/strong>：\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009622\" data-ratio=\"0.4925925925925926\" data-s=\"300,640\" data-type=\"webp\" data-w=\"1080\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771578590.12550823092905272.jpeg\">\u003C/p>\u003C/section>\u003C/section>\u003Ch3 
style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">No difference (Incorrect ❌)\u003C/p>\u003C/blockquote>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">JanusFlow\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 
0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">The image shows a girl sitting at a table with a bowl of ice cream and an ice cream cone. There are two differences between the two images:\u003C/p>\u003Col style=\"margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;\" class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">The girl's hair is slightly different in the second image. In the first image, her hair is straight, while in the second image, her hair is slightly wavy. (Incorrect ❌)\u003C/p>\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">The girl is wearing a pink bow in her hair in the second image, which is not present in the first image. 
(Correct ✅)\u003C/p>\u003C/section>\u003C/li>\u003C/ol>\u003C/blockquote>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus-Pro\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">The difference between the two images is the color of the ice cream scoops. In the left image, the ice cream scoops are green, while in the right image, they are white. 
(Incorrect ❌)\u003C/p>\u003C/blockquote>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009623\" data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771577040.3161764865669583.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Csection style=';margin-top: 20px;margin-bottom: 20px;padding: 10px 20px;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgb(250, 250, 250);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Csection style=\";\">\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">: Solve math problems\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009624\" data-ratio=\"0.09351851851851851\" data-s=\"300,640\" data-type=\"webp\" data-w=\"1080\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771578800.7687437849825127.jpeg\">\u003C/p>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">: ChatGPT o1 and o1 Pro can solve it.\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg 
class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009625\" data-ratio=\"0.18633540372670807\" data-s=\"300,640\" data-type=\"webp\" data-w=\"644\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771579660.9821932096731578.jpeg\">\u003C/p>\u003C/section>\u003C/section>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp>To solve this math problem, we need to determine the value of ( h_1(-2) ).\u003C/p>\u003Cp>First, we need to find the value of ( h_1(-2) ).\u003C/p>\u003Cp>Given the equation:\u003C/p>\u003Cp>[ A^2 = 24^2 - 48t + 2, h_1(-2) = -9 ]\u003C/p>\u003Cp>We can substitute ( A^2 = 24^2 - 48t ) into the given equation:\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>Next, we need to solve the equation for ( t ):\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 
2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t + 2 = -9 ]\u003C/p>\u003Cp>[ 24^2 - 48t +\u003C/p>\u003C/blockquote>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>(Incorrect ❌)\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">JanusFlow\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Csection>To solve this math problem, we need to find the value of ( h_1(1) ) given that ( h_1'(1) = 24t^2 - 48t + 2 ) and ( h_1(-2) = -9 ).\u003C/section>\u003Csection>First, we need to find ( h_1'(1) 
):\u003C/section>\u003Csection>[ h_1'(1) = 24t^2 - 48t + 2 ]\u003C/section>\u003Csection>Next, we need to find ( h_1(-2) ):\u003C/section>\u003Csection>[ h_1(-2) = -9 ]\u003C/section>\u003Csection>Now, we can substitute ( h_1'(1) ) and ( h_1(-2) ) into the equation:\u003C/section>\u003Csection>[ 24t^2 - 48t + 2 - 9 = -9 ]\u003C/section>\u003Csection>Simplifying the equation:\u003C/section>\u003Csection>[ 24t^2 - 47t + 1 = -9 ]\u003C/section>\u003Csection>Subtract 1 from both sides:\u003C/section>\u003Csection>[ 24t^2 - 47t + 1 - 1 = -9 ]\u003C/section>\u003Csection>[ 24t^2 - 46t + 2 = -9 ]\u003C/section>\u003Csection>Now, we need to find ( h_1(1) ):\u003C/section>\u003Csection>[ h_1(1) = 24t^2 - 46t + 2 ]\u003C/section>\u003Csection>Substitute ( h_1(1) ) back into the equation we derived:\u003C/section>\u003Csection>[ 24t^2 - 46t + 2 = 24t^2 - 46t + 2 ]\u003C/section>\u003Csection>Simplify the equation:\u003C/section>\u003Csection>[ 24t^2 - 46t + 2 = 24t^2 - 46t + 2 ]\u003C/section>\u003Csection>This equation is already correct, so there is no need to solve it further. 
The solution is:\u003C/section>\u003Csection>[ h_1(1) = 24t^2 - 46t + 2 ]\u003C/section>\u003C/blockquote>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>(Incorrect ❌)\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus-Pro\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Csection>We have the equation (24^2 - 48t + 2h = -9) and (h(-2) = -4).\u003C/section>\u003Csection>First, let's simplify the equation (24^2 - 48t + 2h = -9).\u003C/section>\u003Csection>[\u003C/section>\u003Csection>\\begin{align*}\u003C/section>\u003Csection>24^2 - 48t + 2h &amp;= -9 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h &amp;= -9 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 
\\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48t + 2h + 9 &amp;= 0 \\\u003C/section>\u003Csection>(24)^2 - 48\u003C/section>\u003C/blockquote>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>(Incorrect ❌)\u003C/p>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009653\" data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" 
src=\"https://res.cooltool.vip/article_res/assets/17423771577730.8934289628271963.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Csection style=';margin-top: 20px;margin-bottom: 20px;padding: 10px 20px;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgb(250, 250, 250);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Csection style=\";\">\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">: Count how many cars are in the parking lot\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009650\" data-ratio=\"0.6647940074906367\" data-s=\"300,640\" data-type=\"png\" data-w=\"1068\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771624230.7700264836646737.png\">\u003C/p>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">: 18 cars (ChatGPT o1 Pro previously answered this incorrectly)\u003C/p>\u003C/section>\u003C/section>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 
0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">There are 20 cars in this parking lot.\u003C/p>\u003C/blockquote>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>(Incorrect ❌)\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">JanusFlow\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">There are 21 cars in the parking 
lot.\u003C/p>\u003C/blockquote>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>(Incorrect ❌)\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus-Pro\u003C/span>\u003C/h3>\u003Cblockquote style='margin-top: 20px;margin-bottom: 20px;;padding: 10px 10px 10px 20px;border-top: 3px none rgba(0, 0, 0, 0.4);border-right: 3px none rgba(0, 0, 0, 0.4);border-bottom: 3px none rgba(0, 0, 0, 0.4);border-left-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0.05);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;overflow: auto;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">There are 14 cars in the parking lot.\u003C/p>\u003C/blockquote>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>(Incorrect ❌)\u003C/p>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: 
flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 289px;height: 13px;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009627\" data-ratio=\"0.04498269896193772\" data-w=\"578\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771575790.5471699414352476.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Ch2 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 22px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Testing image generation\u003C/span>\u003C/h2>\u003Csection style=';margin-top: 20px;margin-bottom: 20px;padding: 10px 20px;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgb(250, 250, 250);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Csection style=\";\">\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Prompt/keywords\u003C/strong>：\u003C/p>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">The image features an intricately designed eye set against a circular backdrop adorned with ornate swirl patterns that evoke 
both realism and surrealism. At the center of attention is a strikingly vivid blue iris surrounded by delicate veins radiating outward from the pupil to create depth and intensity. The eyelashes are long and dark, casting subtle shadows on the skin around them which appears smooth yet slightly textured as if aged or weathered over time.\u003C/p>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">Above the eye, there's a stone-like structure resembling part of classical architecture, adding layers of mystery and timeless elegance to the composition. This architectural element contrasts sharply but harmoniously with the organic curves surrounding it. Below the eye lies another decorative motif reminiscent of baroque artistry, further enhancing the overall sense of eternity encapsulated within each meticulously crafted detail.\u003C/p>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">Overall, the atmosphere exudes a mysterious aura intertwined seamlessly with elements suggesting timelessness, achieved through the juxtaposition of realistic textures and surreal artistic flourishes. 
Each component—from the intricate designs framing the eye to the ancient-looking stone piece above—contributes uniquely towards creating a visually captivating tableau imbued with enigmatic allure.\u003C/p>\u003C/section>\u003C/section>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus\u003C/span>\u003C/h3>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009629\" data-ratio=\"0.7252252252252253\" data-s=\"300,640\" data-type=\"png\" data-w=\"444\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771578530.8665595785907654.png\">\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">JanusFlow\u003C/span>\u003C/h3>\u003Cp style=\"text-align: left;\">Demo has bugs and did not run successfully\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus-Pro\u003C/span>\u003C/h3>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009631\" data-ratio=\"0.7194570135746606\" 
data-s=\"300,640\" data-type=\"png\" data-w=\"442\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771588780.9924340921616881.png\">\u003C/p>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009628\" data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771577040.23909093995274389.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Csection style=';margin-top: 20px;margin-bottom: 20px;padding: 10px 20px;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;background: none 0% 0% / auto no-repeat scroll padding-box border-box rgb(250, 250, 250);width: auto;height: auto;box-shadow: rgba(0, 0, 0, 0) 0px 0px 0px 0px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;'>\u003Csection style=\";\">\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">\u003Cstrong style=\";background: none 0% 0% / auto no-repeat scroll padding-box border-box rgba(0, 0, 0, 0);width: auto;height: auto;border-style: none;border-width: 3px;border-color: rgba(0, 0, 0, 0.4);border-radius: 0px;\">Prompt/keywords\u003C/strong>：\u003C/p>\u003Cp style=\";line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">A charming, cute girl with big sparkling eyes, soft pastel-colored 
hair (e.g., pink, lavender, or mint green), wearing a stylish outfit with subtle frills and bows, standing in a dreamy background filled with soft lighting, cherry blossoms, and gentle gradients. The atmosphere is cheerful and heartwarming, with warm, glowing highlights and delicate details in the surroundings, anime-inspired style\u003C/p>\u003C/section>\u003C/section>\u003Ch2 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003C/h2>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus\u003C/span>\u003C/h3>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009633\" data-ratio=\"0.7228381374722838\" data-s=\"300,640\" data-type=\"png\" data-w=\"451\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771578240.642267228625307.png\">\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">JanusFlow\u003C/span>\u003C/h3>\u003Cp>Demo has bugs and did not run successfully\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: 
rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus-Pro\u003C/span>\u003C/h3>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009632\" data-ratio=\"0.7070484581497798\" data-s=\"300,640\" data-type=\"png\" data-w=\"454\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771577030.27409539404208516.png\">\u003C/p>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 289px;height: 13px;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009660\" data-ratio=\"0.04498269896193772\" data-w=\"578\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771591020.36191558246725464.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cp>\u003Cspan style=\";font-size: 22px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cspan style='letter-spacing: 0em;font-family: mp-quote, \"PingFang SC\", system-ui, -apple-system, BlinkMacSystemFont, \"Helvetica Neue\", \"Hiragino Sans GB\", \"Microsoft YaHei UI\", \"Microsoft YaHei\", Arial, sans-serif;'>More image examples\u003C/span>\u003C/span>\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: 
block;\">Janus\u003C/span>\u003C/h3>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009640\" data-ratio=\"0.6155268022181146\" data-s=\"300,640\" data-type=\"png\" data-w=\"541\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771592960.05764903548898692.png\">\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cspan style='color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 20px;font-weight: 700;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>JanusFlow\u003C/span>\u003C/span>\u003C/h3>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009646\" data-ratio=\"0.6297872340425532\" data-s=\"300,640\" data-type=\"png\" data-w=\"470\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771589420.4821377669013991.png\">\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cspan style='color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 20px;font-weight: 700;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style='color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", 
PingFangSC-regular, serif;font-size: 20px;font-weight: 700;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>Janus-Pro\u003C/span>\u003C/span>\u003C/span>\u003C/h3>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009638\" data-ratio=\"0.6888888888888889\" data-s=\"300,640\" data-type=\"png\" data-w=\"1080\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771577460.2646837087356111.png\">\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 289px;height: 13px;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009637\" data-ratio=\"0.04498269896193772\" data-w=\"578\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771577200.399295574050339.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cspan style='color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 20px;font-weight: 700;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style='color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 20px;font-weight: 700;letter-spacing: normal;text-align: 
left;background-color: rgb(255, 255, 255);'>\u003Cbr>\u003C/span>\u003C/span>\u003C/span>\u003C/h3>\u003Cp>\u003Cspan style=\";font-size: 22px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Evaluation\u003C/span>\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus\u003C/span>\u003C/h3>\u003Cp style='margin-bottom: 0px;padding-top: 8px;padding-bottom: 8px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Benchmark test performance\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009639\" data-ratio=\"0.8574712643678161\" data-s=\"300,640\" data-type=\"png\" data-w=\"435\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771577340.5059536521060608.png\">\u003C/p>\u003Cp style=\"text-align: left;\">\u003Cspan style='color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>Visual generation results\u003C/span>\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009644\" data-ratio=\"0.5784008307372793\" data-s=\"300,640\" data-type=\"png\" data-w=\"963\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771579780.8536840646860091.png\">\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 
0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg data-imgfileid=\"100009665\" data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771577350.3613772072082295.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cbr>\u003C/span>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">JanusFlow\u003C/span>\u003C/h3>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Benchmark test performance\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009645\" data-ratio=\"0.8465608465608465\" data-s=\"300,640\" data-type=\"png\" data-w=\"378\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771577420.9967039982485209.png\">\u003C/p>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 
8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Visual generation results\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009649\" data-ratio=\"0.7797619047619048\" data-s=\"300,640\" data-type=\"png\" data-w=\"840\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771596220.35874826918037717.png\">\u003C/p>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009664\" data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771580900.5631038754601672.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cbr>\u003C/span>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus-Pro\u003C/span>\u003C/h3>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, 
\"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Average performance in four multimodal understanding benchmark tests\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009634\" data-ratio=\"0.7970204841713222\" data-s=\"300,640\" data-type=\"png\" data-w=\"537\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771589010.8301582634382545.png\">\u003C/p>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Performance in text-to-image generation instruction-following benchmark tests\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009635\" data-ratio=\"0.769090909090909\" data-s=\"300,640\" data-type=\"png\" data-w=\"550\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771588890.08363749385359509.png\">\u003C/p>\u003Ch2 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 289px;height: 13px;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009661\" data-ratio=\"0.04498269896193772\" data-w=\"578\" style=\"height: auto !important;\" 
src=\"https://res.cooltool.vip/article_res/assets/17423771578730.6530312722805418.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 22px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cbr>\u003C/span>\u003Cspan style=\";font-size: 22px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Technical framework\u003C/span>\u003C/h2>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus\u003C/span>\u003C/h3>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Unlike previous methods that typically assume visual understanding and generation share the same visual encoder, Janus decouples visual encoding into independent modules for visual understanding and visual generation. “Und. Encoder” and “Gen. 
Encoder” are the abbreviations for “Understanding Encoder” and “Generation Encoder,” respectively.\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009641\" data-ratio=\"0.4323189926547744\" data-s=\"300,640\" data-type=\"png\" data-w=\"953\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771579130.8457446012735161.png\">\u003C/p>\u003Cp style='margin-bottom: 0px;padding-top: 8px;padding-bottom: 8px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Janus' three-stage training process:\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009643\" data-ratio=\"0.2930135557872784\" data-s=\"300,640\" data-type=\"png\" data-w=\"959\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771579080.9492266043040429.png\">\u003C/p>\u003Cul class=\"list-paddingleft-1\" style='margin-top: 8px;margin-bottom: 8px;padding-left: 25px;width: 577.422px;;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\"padding-top: 8px;padding-bottom: 8px;;color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;\">The goal is to establish conceptual connections between vision and language in the embedding space, giving the model preliminary visual generation capabilities. 
The visual encoder and LLM are frozen during this stage; only the understanding adapter, generation adapter, and image head are updated.\u003C/p>\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\"padding-top: 8px;padding-bottom: 8px;;color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;\">Train on multimodal corpora so the model learns both multimodal understanding and generation. The LLM is unfrozen and trained on pure text data, multimodal understanding data, and visual generation data; visual generation training starts from ImageNet-1k and then expands to open-domain text-to-image data.\u003C/p>\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\"margin-top: 5px;margin-bottom: 5px;;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\"padding-top: 8px;padding-bottom: 8px;;color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;\">Fine-tune the model on instruction-tuning data to improve its instruction-following and conversational abilities. All parameters except the generation encoder are unfrozen. 
The data mix includes pure text dialogues, multimodal understanding, and visual generation, ensuring the model's versatility across various scenarios.\u003C/p>\u003C/section>\u003C/li>\u003C/ul>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg data-imgfileid=\"100009662\" data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771579030.8432833794507857.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cbr>\u003C/span>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">JanusFlow\u003C/span>\u003C/h3>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>JanusFlow architecture: In visual understanding, the LLM generates responses through autoregressive prediction; in image generation, starting from Gaussian noise (𝑡=0), the LLM iteratively updates 𝑧𝑡 by predicting velocity vectors until 𝑡=1. 
For simplification, the VAE encoder, skip connections in generation, and the linear layer after 𝑓𝑒𝑛𝑐 are omitted.\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009647\" data-ratio=\"0.37187127532777114\" data-s=\"300,640\" data-type=\"png\" data-w=\"839\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771579040.7738752165781326.png\">\u003C/p>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>JanusFlow’s three-stage training process:\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009648\" data-ratio=\"0.34491315136476425\" data-s=\"300,640\" data-type=\"png\" data-w=\"806\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771588740.8010329001501253.png\">\u003C/p>\u003Cul style='margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);' class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">Train randomly initialized linear layers, generation encoders, and generation decoders so these new modules work together with the pretrained LLM and SigLIP encoder, completing initialization.\u003C/p>\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 
5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">Train the entire model (visual encoder frozen) using three types of data: multimodal understanding data, image generation data, and pure text data. Initially focus on multimodal understanding, later increasing the proportion of image generation data to adapt to diffusion model convergence needs.\u003C/p>\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">Fine-tune the model using instruction tuning data (including dialogues, task-specific dialogues, high-quality text-to-image generation examples). Unfreeze the SigLIP encoder to improve instruction response capabilities for multimodal understanding and image generation tasks.\u003C/p>\u003C/section>\u003C/li>\u003C/ul>\u003Ch3 style='margin-top: 30px;margin-bottom: 15px;color: rgba(0, 0, 0, 0.85);;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);'>\u003Csection data-mpa-template=\"t\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"display: flex;justify-content: center;align-items: center;width: 100%;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Csection style=\"width: 259px;display: flex;justify-content: center;align-items: center;\" data-mid=\"\" mpa-from-tpl=\"t\">\u003Cimg class=\"rich_pages wxw-img\" data-imgfileid=\"100009663\" data-ratio=\"0.02702702702702703\" data-w=\"518\" style=\"display: block;height: auto !important;\" 
src=\"https://res.cooltool.vip/article_res/assets/17423771580360.8984615541913361.png\">\u003C/section>\u003C/section>\u003C/section>\u003C/section>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">\u003Cbr>\u003C/span>\u003Cspan style=\";font-size: 20px;color: rgb(0, 0, 0);line-height: 1.5em;letter-spacing: 0em;font-weight: bold;display: block;\">Janus-Pro\u003C/span>\u003C/h3>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Janus-Pro architecture: Decouple visual encoding into multimodal understanding (Understanding Encoder, abbreviated as “Und. Encoder”) and visual generation (Generation Encoder, abbreviated as “Gen. Encoder”).\u003C/p>\u003Cp style=\"text-align: center;\">\u003Cimg class=\"rich_pages wxw-img js_insertlocalimg\" data-imgfileid=\"100009642\" data-ratio=\"0.4361111111111111\" data-s=\"300,640\" data-type=\"png\" data-w=\"1080\" style=\"height: auto !important;\" src=\"https://res.cooltool.vip/article_res/assets/17423771593360.7998762869409697.png\">\u003C/p>\u003Cp style='margin-bottom: 0px;;color: rgb(0, 0, 0);font-size: 16px;line-height: 1.8em;letter-spacing: normal;text-align: left;padding-top: 8px;padding-bottom: 8px;font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;background-color: rgb(255, 255, 255);'>Compared to Janus, Pro optimizes the three stages as follows:\u003C/p>\u003Cul style='margin-top: 8px;margin-bottom: 8px;;padding-left: 25px;color: rgb(0, 0, 0);font-family: Optima, \"Microsoft YaHei\", PingFangSC-regular, serif;font-size: 16px;letter-spacing: normal;text-align: left;background-color: rgb(255, 255, 255);' class=\"list-paddingleft-1\">\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 
5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">Increase the number of training steps in Stage I so the model trains sufficiently on the ImageNet dataset. Even with the LLM parameters frozen, the model can effectively model pixel dependencies and generate reasonable images from class names.\u003C/p>\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">In Stage II, drop the ImageNet data and train directly on regular text-to-image data, so the model learns to generate images from dense descriptions; this improves training efficiency and overall performance.\u003C/p>\u003C/section>\u003C/li>\u003Cli style=\";\">\u003Csection style=\";margin-top: 5px;margin-bottom: 5px;color: rgb(1, 1, 1);line-height: 1.8em;letter-spacing: 0em;\">\u003Cp style=\";color: rgb(0, 0, 0);line-height: 1.8em;letter-spacing: 0em;text-indent: 0em;padding-top: 8px;padding-bottom: 8px;\">In supervised fine-tuning, adjust the ratio of multimodal data, pure text data, and text-to-image data from 7:3:10 to 5:1:4. Slightly reducing the proportion of text-to-image data lets the model keep strong visual generation capabilities while improving multimodal understanding performance.\u003C/p>\u003C/section>\u003C/li>\u003C/ul>\u003Cp style=\"display: none;\">\u003Cmp-style-type data-value=\"3\">\u003C/mp-style-type>\u003C/p>\u003C/div>",1752585425162]