Understanding Snowflake Cortex for Gen AI Applications with Sensitive Data

In 2017, Google’s research paper ‘Attention Is All You Need’ introduced the transformer architecture and revolutionized Natural Language Processing (NLP). Since then, open-source and proprietary large language models (LLMs) have swept across the AI landscape.

Snowflake Cortex builds its NLP capabilities on the transformer architecture described in that paper, but takes it a step further. Instead of forcing users to become AI experts who can train and fine-tune models (which can take many months) and then deploy, scale, and monitor them (which incurs considerable cost), Snowflake Cortex provides a host of industry-leading, pre-trained open-source language models.

Snowflake also provides its own managed LLMs, which cover most common LLM application use cases. All of these managed and open-source models can be used at a fraction of the cost of training and hosting a model yourself, which is a significant step for Snowflake in democratizing AI.

Snowflake Cortex: Seamless AI Integration with Top-notch Governance

With Cortex, all LLMs are fully hosted and managed by Snowflake, eliminating the need for a complicated setup. Your data remains within Snowflake, ensuring optimal performance, scalability, and privacy.

Cortex’s Suite of Language Models

Snowflake Cortex exposes its features as SQL functions and also supports Python. Below is a summary of the available functions; the set of supported AI models is updated continuously. The following models are available as of April 20, 2024:

Cortex Complete Functions – Prompts Required

  • mistral-large: High-capacity language model
  • mixtral-8x7b: Optimized performance model
  • llama2-70b-chat: Advanced conversational model
  • mistral-7b: Efficient mid-size model
  • gemma-7b: Versatile application model
  • reka-flash: Multimodal language model

The output of these models can be customized through the prompt, for example as JSON or as a bulleted list. Carefully crafted prompts also help reduce hallucinations and inaccurate answers. Snowflake offers the open-source and commercially licensed models listed above for customizable LLM applications.
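As a rough sketch of what such prompt customization can look like, the Python below builds a prompt that asks for JSON output, pairs it with a call to the SNOWFLAKE.CORTEX.COMPLETE SQL function, and checks that the reply really is the JSON that was requested. The helper names, the example keys, and the sample note are hypothetical, and the code only prepares and validates text; it assumes the statement is executed separately through whatever Snowflake client you use.

```python
import json

# Instruction that constrains the output format and discourages guessing.
JSON_INSTRUCTION = (
    "Answer only with a JSON object that has the keys {keys}. "
    "If the note does not contain a value, use null instead of guessing."
)


def build_json_prompt(note, keys):
    """Build a prompt that asks the model for a JSON object with the given keys."""
    key_list = ", ".join('"' + key + '"' for key in keys)
    return JSON_INSTRUCTION.format(keys=key_list) + "\n\nNote:\n" + note


def build_complete_call(model, prompt):
    """Return a parameterized COMPLETE statement and its bind values.

    Using bind placeholders (an assumption about how the statement is run)
    keeps the model name and prompt out of the SQL text itself.
    """
    return "SELECT SNOWFLAKE.CORTEX.COMPLETE(?, ?) AS response", [model, prompt]


def parse_json_reply(reply, keys):
    """Return the reply as a dict, or None if it is not the JSON we asked for."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict) or set(data) != set(keys):
        return None
    return data


if __name__ == "__main__":
    keys = ["drug", "dose"]  # hypothetical fields to extract
    prompt = build_json_prompt("Patient takes 5 mg of drug X daily.", keys)
    sql, params = build_complete_call("mistral-large", prompt)
    print(sql)
    print(params[0])
```

Rejecting any reply that is not valid JSON with exactly the expected keys is a simple way to catch free-text or partial answers before they reach downstream code.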

Cortex Managed Functions – No Prompt Required

Commonly Asked Questions to Understand Snowflake Cortex Better

Does Snowflake support RAG frameworks to reduce hallucinations?

Yes. Retrieval-augmented generation (RAG) reduces hallucinations by grounding the model’s answers in documents retrieved for each question. For embeddings, Cortex uses the e5-base-v2 model. You also do not need an external vector database such as Pinecone or Chroma to store the embeddings, because Snowflake tables can store vectors directly.

RAG framework
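To make the retrieval step concrete, here is a minimal sketch in plain Python: rank stored text chunks by cosine similarity to the question and place the best matches into the prompt. The three-dimensional vectors and the sample texts are made up for illustration; in Snowflake the vectors would come from the e5-base-v2 embedding model (which produces 768-dimensional vectors) and the ranking would typically run in SQL over a table that stores them.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_vec, chunks, k=2):
    """Return the texts of the k chunks most similar to the query vector."""
    ranked = sorted(
        chunks,
        key=lambda chunk: cosine_similarity(query_vec, chunk["embedding"]),
        reverse=True,
    )
    return [chunk["text"] for chunk in ranked[:k]]


def build_grounded_prompt(question, context_chunks):
    """Build a prompt that restricts the model to the retrieved context."""
    context = "\n".join("- " + text for text in context_chunks)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say you do not know.\n\n"
        "Context:\n" + context + "\n\nQuestion: " + question
    )


if __name__ == "__main__":
    # Toy data: hypothetical chunks with made-up 3-dimensional embeddings.
    chunks = [
        {"text": "Drug A lowers blood pressure.", "embedding": [0.9, 0.1, 0.0]},
        {"text": "The cafeteria opens at 8 am.", "embedding": [0.0, 1.0, 0.0]},
        {"text": "Drug A may cause dizziness.", "embedding": [0.7, 0.7, 0.0]},
    ]
    question_vec = [1.0, 0.0, 0.0]  # stands in for the embedded question
    selected = top_k_chunks(question_vec, chunks, k=2)
    print(build_grounded_prompt("What does Drug A do?", selected))
```

The instruction to answer only from the supplied context is what ties the retrieval step to the reduction in hallucinations.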

Can one use Snowflake Copilot to build chatbots?

Yes. Snowflake Copilot is a text-to-SQL application that uses mistral-7b together with Snowflake’s proprietary SQL generation engine for enhanced performance. It integrates natively with Streamlit for building chatbots. One can also build a text-to-SQL chatbot in Streamlit using the Cortex Complete functions.
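For a text-to-SQL chatbot built on the Complete functions, two pieces usually sit around the model call: a prompt that carries the table definition, and a check on the generated SQL before it is run. The sketch below shows one possible version of both. The function names, the table, and the keyword list are hypothetical, and the check is deliberately conservative (it can reject a harmless query that mentions a blocked word), so treat it as a starting point rather than a complete safeguard for sensitive data.

```python
import re

# Statement keywords that a read-only chatbot query should never contain.
FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|revoke|call|copy)\b",
    re.IGNORECASE,
)


def build_text_to_sql_prompt(question, table_ddl):
    """Build a prompt that gives the model the schema and asks for one SELECT."""
    return (
        "You write Snowflake SQL. Using only the table below, "
        "return a single SELECT statement and nothing else.\n\n"
        + table_ddl
        + "\n\nQuestion: "
        + question
    )


def is_read_only_select(sql):
    """Return True only for a single statement that starts with SELECT or WITH
    and contains none of the blocked keywords."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        return False  # more than one statement
    if not re.match(r"(?i)^(select|with)\b", statement):
        return False
    return FORBIDDEN.search(statement) is None


if __name__ == "__main__":
    ddl = "CREATE TABLE patients (name STRING, age INT)"  # hypothetical table
    print(build_text_to_sql_prompt("How many patients are over 60?", ddl))
    print(is_read_only_select("SELECT COUNT(*) FROM patients WHERE age > 60;"))
```

In a Streamlit app, the generated statement would be executed only when the check passes; otherwise the chatbot can ask the user to rephrase the question.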

Can one use Cortex functions in Snowpark Container Services?

Yes. Snowpark Container Services (SPCS) is versatile enough that one can build and fine-tune one’s own LLMs and run RAG applications, all within Snowflake. One can also bring any LLM, whether open source or proprietary, into SPCS.

Glossary: Snowpark is a set of libraries and runtimes in Snowflake, the Data Cloud platform, that lets you securely run code written in languages such as Python, Java, and Scala next to your data.

Is it possible to use a proprietary model like OpenAI’s GPT models with Snowflake?

Yes. OpenAI’s GPT models can be called through an external access integration, though they are not part of Cortex. Many companies are happy to tap into the power of OpenAI’s pre-trained proprietary GPT models via its API, but many others refrain from sharing private data with OpenAI, or with Google or with Microsoft, which has partnered with OpenAI and is integrating LLMs deeply across its product set. Patient medical records are one example of such sensitive data.

Here, Snowflake Cortex’s Gen AI offerings are a much-needed feature.

Is it possible to use open-source BioGPT on Snowflake?

Presently, BioGPT is not part of the Cortex family, though Snowflake may include it in the future. However, one can run BioGPT, a model pre-trained on biomedical literature, on Snowflake’s SPCS offering. This saves the trouble of taking a general-purpose LLM and fine-tuning it with medical data.

Let’s look at examples of how Cortex’s AI capabilities can be applied in the life sciences sector, where detecting and preventing potential security threats, safeguarding critical information, and maintaining trust are essential.

Cortex AI Applications for the Life Science Industry

As a managed AI service, Cortex can aid the life science industry by streamlining drug discovery, analyzing complex biological data, and accelerating research processes, leading to breakthroughs in healthcare and pharmaceutical innovations.

Supporting these applications, Cortex enhances collaboration among researchers, improves decision-making by providing data-driven insights, and fosters innovation through its ability to handle diverse and complex data sets.

Medical Imaging Analysis: AI can analyze medical images such as X-rays, MRIs, CT scans, and mammograms to assist radiologists in detecting abnormalities, tumors, fractures, or other conditions with greater accuracy and speed.

Disease Diagnosis: AI can support disease diagnosis by analyzing patient data such as symptoms, medical history, and lab results, helping doctors make faster and more accurate diagnoses.

Drug Discovery and Development: AI can analyze large datasets and predict the effectiveness of potential drug compounds, identify new drug targets, optimize drug design, and accelerate the drug discovery and development process.

Electronic Health Records (EHR): AI-powered NLP systems extract valuable information from unstructured clinical notes and EHRs, enabling healthcare providers to access and analyze patient data more efficiently and accurately.

Healthcare Fraud Detection: AI algorithms analyze healthcare billing data and claims to detect fraudulent activities, identify billing errors, and prevent healthcare fraud, saving healthcare organizations billions annually.

Healthcare Chatbots and Virtual Assistants: AI-powered chatbots and virtual assistants provide patients with personalized health advice, answer medical questions, schedule appointments, remind patients to take medication, and triage medical issues.

About the Author

Ashwin Suresh

Principal Consultant, Data & AI, Hexaware

Ashwin brings over 19 years of hands-on experience in Machine Learning and Artificial Intelligence, making him a go-to expert in the field. He's been instrumental in helping organizations unlock the true potential of their data, leading to game-changing results and measurable success. With a background that spans retail, banking, and supply chain, Ashwin is now driving innovation at Hexaware’s Snowflake Center of Excellence.
