Note for AI

Design consideration \ Route query to model \ API gateway

What is a key consideration when scaling a GenAI app using large models?
Always using the largest model can be inefficient and expensive. Model routing lets you balance cost, speed, and quality by selecting a model based on query type: simpler queries go to smaller models, and complex ones go to larger LLMs.

Which of the following is a common rule used to route a query to a smaller, faster model? ...
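As a rough illustration of the model routing described in this section, here is a minimal rule-based router. The signals it uses (query length and a few keywords), the thresholds, and the model names are all assumptions made up for this sketch; they are not the answer to the question above and not a rule from any particular API gateway.

```python
from dataclasses import dataclass

# Hypothetical model names and thresholds, chosen only for illustration.
SMALL_MODEL = "small-fast-model"
LARGE_MODEL = "large-capable-model"
MAX_SIMPLE_WORDS = 20
COMPLEX_HINTS = ("explain why", "step by step", "compare", "analyze")


@dataclass
class RoutingDecision:
    model: str
    reason: str


def route_query(query: str) -> RoutingDecision:
    """Pick a model tier for a query using simple, assumed heuristics."""
    text = query.lower()
    # Queries that ask for multi-step reasoning go to the larger model.
    for hint in COMPLEX_HINTS:
        if hint in text:
            return RoutingDecision(LARGE_MODEL, f"contains '{hint}'")
    # Long queries are treated as complex as well.
    if len(text.split()) > MAX_SIMPLE_WORDS:
        return RoutingDecision(LARGE_MODEL, "long query")
    # Everything else is assumed cheap enough for the smaller model.
    return RoutingDecision(SMALL_MODEL, "short, simple query")


if __name__ == "__main__":
    for q in ("What is the capital of France?",
              "Compare these two contracts and explain why clause 4 conflicts."):
        decision = route_query(q)
        print(f"{decision.model} <- {q} ({decision.reason})")
```

In a real deployment the routing signal might instead come from a classifier or from request metadata; the keyword check here only keeps the sketch self-contained.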

Design pattern

Which SOLID principle states that a class should have only one reason to change?
Single Responsibility Principle

Which design pattern is demonstrated by Python’s __new__ method?
The __new__ method can be used to implement the Singleton pattern by ensuring that only one instance of a class is created. Typical uses: database connection pools, thread pools.

Singleton pattern: a class can have only one instance and provides a global access point to it. In __new__, check whether an instance has already been created; if so, return the existing instance.

    class Singleton:
        _instance = None  # holds the single instance

        def __new__(cls, *args, **kwargs):
            if cls._instance is None:
                # __new__ must return an instance of the class
                # (usually super().__new__(cls)). If it returns something
                # that is not an instance of this class, __init__ is not called.
                cls._instance = super().__new__(cls)
            return cls._instance

        def __init__(self, value):
            self.value = value

    a = Singleton(10)
    b = Singleton(20)
    print(a is b)  # True: a and b are the same object
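One caveat with the __new__-based Singleton in these notes: Python still calls __init__ on every Singleton(...) call, so the second call overwrites value (it ends up as 20). A minimal sketch of one way to guard against that follows; the class name GuardedSingleton and the _initialized flag are hypothetical additions, not part of the original notes.

```python
class GuardedSingleton:
    _instance = None  # holds the single shared instance

    def __new__(cls, *args, **kwargs):
        if cls._instance is None:
            cls._instance = super().__new__(cls)
            # Hypothetical flag: records that __init__ has not run yet.
            cls._instance._initialized = False
        return cls._instance

    def __init__(self, value):
        # __init__ runs on every GuardedSingleton(...) call, so skip
        # re-initialization once the instance has been set up.
        if self._initialized:
            return
        self.value = value
        self._initialized = True


a = GuardedSingleton(10)
b = GuardedSingleton(20)
print(a is b)    # True
print(a.value)   # 10, the second call did not overwrite it
```

Whether later constructor arguments should be ignored, rejected, or applied is a design choice; this sketch simply ignores them.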

Data type

Why are dictionaries faster than lists for key-based lookups?
Dictionaries in Python use a hash table internally, so they can compute the location of the value for a given key directly, with an average complexity of O(1). Lists, on the other hand, require a sequential search (O(n)) to find an item.

Which of the following statements is true regarding the performance of Pandas and NumPy? ...
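The dictionary-versus-list point in this section can be made concrete by counting comparisons instead of timing. In this small sketch, list_lookup and the sample data are made up for illustration; the list version scans (key, value) pairs one by one, while the dict finds the value by hashing the key.

```python
def list_lookup(pairs, key):
    """Linear scan over (key, value) pairs; returns (value, comparisons)."""
    comparisons = 0
    for k, v in pairs:
        comparisons += 1
        if k == key:
            return v, comparisons
    return None, comparisons


n = 100_000
pairs = [(f"user{i}", i) for i in range(n)]  # list of (key, value) tuples
table = dict(pairs)                          # same data in a hash table

value, comparisons = list_lookup(pairs, "user99999")
print(value, comparisons)    # 99999 100000 (worst case: the key is last)
print(table["user99999"])    # 99999, located by hashing the key
```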

From an application perspective, how the cloud vendor fulfills the requirements

Azure

What is the primary function of Azure AI Model Inference?
Performs model inference for flagship models in the Azure AI model catalog.

What role does Azure AI Search play in retrieval-augmented generation (RAG)?
Acts as a vector database.

What is the purpose of a skillset in Azure Cognitive Search? (Azure Cognitive Search is the former name of Azure AI Search.)
To apply AI-powered transformations to enrich indexed data. A skillset enriches data by applying AI capabilities such as OCR, entity recognition, and text translation before indexing.

...

LLM

What is the main difference between the encoder and decoder in a transformer model?
Encoders process input into context-rich representations, while decoders generate output tokens using the encoder’s output and previous tokens.

Encoder
Input: Takes in the full input sequence (e.g., a sentence).
How: Uses self-attention to understand relationships between all input tokens.
Output: A sequence of contextual embeddings (alignment vector) that capture the meaning of the input.

Decoder
Input: Takes the encoder’s output plus previously generated tokens.
How: Uses cross-attention (on encoder output) and masked self-attention (on previous outputs).
Output: A sequence of output tokens (e.g., translated sentence, summary, etc.).

What is the difference between self-attention and cross-attention in Transformer models? ...
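The encoder and decoder notes in this section mention both self-attention and cross-attention. The sketch below shows only where the queries, keys, and values come from in each case, using scaled dot-product attention on plain Python lists. It is a simplification under stated assumptions: the toy vectors are made up, and the learned projection matrices, multiple heads, and decoder masking of a real Transformer are left out.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on plain lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
                  for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        outputs.append([sum(w * v[j] for w, v in zip(weights, values))
                        for j in range(len(values[0]))])
    return outputs


# Toy 2-dimensional token vectors (made-up numbers).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # 3 input tokens
decoder_states = [[0.5, 0.5], [1.0, 0.0]]              # 2 generated tokens

# Self-attention: queries, keys, and values all come from the same sequence.
self_out = attention(encoder_states, encoder_states, encoder_states)

# Cross-attention: queries come from the decoder, keys/values from the encoder.
cross_out = attention(decoder_states, encoder_states, encoder_states)

print(len(self_out), len(cross_out))  # 3 2
```

The output has one vector per query, which is why cross-attention returns one vector per decoder position even though it reads from the encoder's states.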