In general, the response matches that shape, but it's not guaranteed. We need to be a little defensive here and validate our input. If a response fails validation, we output it to an error collection. In this sample, we leave those values there. For a production pipeline, you might want to let the LLM try a second time: run the error collection through RunInference again and then flatten the result with the main results collection. Because Beam pipelines are directed acyclic graphs, we can't create a loop here.

We now take the results collection and process the LLM output. To process the results of RunInference, we create a new DoFn, SentimentAnalysis, and a function, extract_model_reply(). RunInference returns an object of type PredictionResult.

It's worth spending a few minutes on the need for extract_model_reply(). Because the model is self-hosted, we cannot guarantee that its text output will be valid JSON. To make sure we get JSON, we need to run a couple of checks. One benefit of using the Gemini API is that it includes a feature, known as constrained decoding, that ensures the output is always JSON.
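The article's code listings are not included in this extract, so the following is only a minimal sketch of what extract_model_reply() and the SentimentAnalysis DoFn might look like. The JSON keys (sentiment, summary) come from the shape described above; keying each element by a chat ID and the negative/error output tags are assumptions made for illustration.

```python
import json
import re

import apache_beam as beam


def extract_model_reply(model_inference: str) -> dict:
    """Pull the first JSON object out of the raw model text and validate it.

    The model is self-hosted, so the reply is not guaranteed to be JSON:
    we look for a {...} block and check for the keys used downstream.
    """
    match = re.search(r"\{[\s\S]*?\}", model_inference)
    if match is None:
        raise ValueError("no JSON object found in model reply")
    reply = json.loads(match.group(0))
    if not all(key in reply for key in ("sentiment", "summary")):
        raise ValueError("model reply is missing expected keys")
    return reply


class SentimentAnalysis(beam.DoFn):
    """Route RunInference output to the main, negative, or error collection.

    Assumes elements are (chat_id, PredictionResult) pairs, where
    PredictionResult.example holds the prompt and .inference the raw reply.
    """

    def process(self, element):
        chat_id, prediction = element
        try:
            # The reply often echoes the prompt, so strip it before parsing.
            reply = extract_model_reply(
                prediction.inference.replace(prediction.example, ""))
            processed = (chat_id, prediction.example,
                         reply["sentiment"], reply["summary"])
            if reply["sentiment"] < 0:
                yield beam.pvalue.TaggedOutput("negative", processed)
            else:
                yield processed  # untagged main output
        except Exception:
            # Unparsable replies land in the error collection.
            yield beam.pvalue.TaggedOutput("error", element)
```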
Let's now use these functions in our pipeline. Using with_outputs creates multiple accessible collections in filtered_results. The main collection has sentiments and summaries for positive and neutral reviews, while error contains any unparsable responses from the LLM. You can send these collections to sinks, such as BigQuery, with a write transform. This example doesn't demonstrate that step; the negative collection, however, is one we want to do more with inside this pipeline.

Making sure customers are happy is critical for retention. While we have used a light-hearted example with our pineapple-on-pizza debate, direct interactions with a customer should always strive for empathy and positive responses from all parts of an organization. At this stage, we pass the chat on to one of the trained support representatives, but we can still see whether the LLM can assist that support person in reducing the time to resolution.

For this step, we make a call to the model and ask it to formulate a response. We again use the Gemma 2B model for this call in the code. In general, you would wrap the prompt-creation code in a DoFn, but it is also possible to use a simple lambda in the pipeline code itself. Here we generate a prompt that contains the original chat message, which was extracted in the SentimentAnalysis function.
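The original pipeline code is not part of this extract, so the sketch below only shows how these pieces might be wired together. The name filtered_results and the main/negative/error tags come from the text above; the chats input collection, the model_handler (a keyed ModelHandler configured elsewhere for the Gemma 2B model, for example via KeyedModelHandler), and the exact prompt wording are assumptions.

```python
import apache_beam as beam
from apache_beam.ml.inference.base import RunInference

# `chats` is assumed to be a PCollection of (chat_id, prompt) pairs and
# `model_handler` a keyed ModelHandler configured for the Gemma 2B model.
filtered_results = (
    chats
    | "RunInference Gemma" >> RunInference(model_handler)
    | "Sentiment analysis" >> beam.ParDo(SentimentAnalysis()).with_outputs(
        "negative", "error", main="main")
)

# For negative chats, build a follow-up prompt with a simple lambda and ask
# the same model to draft an empathetic reply for the support representative.
# x[1] holds the chat text carried through SentimentAnalysis.
generated_responses = (
    filtered_results.negative
    | "Build response prompt" >> beam.Map(
        lambda x: (x[0],
                   "Draft a short, empathetic reply to this customer chat: "
                   + x[1]))
    | "Generate response" >> RunInference(model_handler)
)
```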

For local running and testing, we can use some simple print statements to see the outputs of the various PCollections. For real usage, of course, these outputs would be sent to sinks such as Pub/Sub and BigQuery.
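As one way to do this locally, the runnable snippet below feeds a couple of hand-built PredictionResult values through the SentimentAnalysis DoFn sketched above (standing in for a real RunInference step) and prints each tagged collection. The chat IDs and messages are made up for illustration.

```python
import apache_beam as beam
from apache_beam.ml.inference.base import PredictionResult

# Hand-built stand-ins for RunInference output, for local testing only.
fake_predictions = [
    ("chat-221", PredictionResult(
        example="User 221: pineapple on pizza is an outrage",
        inference='{"sentiment": -1, "summary": "User 221 dislikes pineapple on pizza."}')),
    ("chat-300", PredictionResult(
        example="User 300: the new menu looks great",
        inference='{"sentiment": 1, "summary": "User 300 likes the new menu."}')),
]

with beam.Pipeline() as p:
    results = (
        p
        | beam.Create(fake_predictions)
        | beam.ParDo(SentimentAnalysis()).with_outputs(
            "negative", "error", main="main")
    )
    # Print each collection so we can eyeball the routing locally.
    _ = results.main | "print main" >> beam.Map(print)
    _ = results.negative | "print negative" >> beam.Map(print)
    _ = results.error | "print error" >> beam.Map(print)
```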

Let's see how the model does with the previous JSON message:

Step 1: Sentiment analysis and summarization

    "sentiment": -1,
    "summary": "User 221 is very unhappy about the presence of pineapple on pizza."

The responses that the 2B model generated aren't bad. The sentiment is correct, and because the summary is more subjective, the correctness of the response depends on the downstream uses of this information.

Step 2: Generated response