Abstract
This paper surveys the state of the art in hybrid language model architectures and strategies for "complex" question answering (QA, CQA, CPS). Very large language models are good at leveraging public data on standard problems, but tackling more specific complex questions or problems may require specific architectures, knowledge, skills, tasks, methods, sensitive data, performance, human approval, and versatile feedback. This survey extends findings from the robust, community-edited research papers BIG, BLOOM and HELM, which open-source, benchmark, and analyze the limits and challenges of large language models in terms of task complexity and strict evaluation (e.g. accuracy, fairness, robustness, toxicity). It identifies the key elements used with large language models (LLMs) to solve complex questions or problems. Recent projects such as ChatGPT and GALACTICA have allowed non-specialists to grasp the great potential, as well as the equally strong limitations, of language models in complex QA. Hybridizing these models with different components could overcome these limits and go much further. We discuss some challenges associated with complex QA, including domain adaptation, decomposition and efficient multi-step QA, long-form QA, non-factoid QA, safety and multi-sensitivity data protection, multimodal search, hallucinations, QA explainability and truthfulness, and the time dimension. We therefore review current solutions and promising strategies, using elements such as hybrid LLM architectures, human-in-the-loop reinforcement learning, prompting adaptation, neuro-symbolic and structured knowledge grounding, program synthesis, and others. We analyze existing solutions and provide an overview of current research and trends in the area of complex QA.