Daanalytics

Introducing Snowflake Worksheets for Python

Recently Snowflake made working with Python from within Snowflake a little bit better. ‘Snowflake Worksheets for Python’ is now in Public Preview and available to everyone. Time to find out how it works.

Generate Faker Data

I am planning a series of blog posts about Snowflake Data Governance in combination with Alation, and for that I need some demo data. Recently I saw a blog post by my fellow Snowflake Data Superhero Maja Ferle about generating sample data using the Faker library within Snowpark. I took that blog post as a starting point to generate some fake customer data using Snowflake Worksheets for Python.

Enable Anaconda Python Packages

To be able to use the Snowflake Worksheets for Python feature, you first need to review and accept the Anaconda Terms of Service in Snowsight. To do so, follow these steps:

  1. Select the ORGADMIN role
  2. Go to Admin, Billing & Terms
  3. Enable Anaconda Python packages
  4. Acknowledge & Continue

Packages available in Anaconda

First you have to make sure that the required packages are available in Anaconda, by either checking the documentation or executing the following query:

If you want to select specific packages and their versions, execute the following query. In this example we are looking for ‘faker’ and ‘pandas’.
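The original queries did not survive in this copy of the post. As a minimal sketch, assuming the package list is exposed through the INFORMATION_SCHEMA.PACKAGES view with the columns PACKAGE_NAME, VERSION and LANGUAGE, both lookups could be built like this. The helper name `packages_query` is made up for illustration:

```python
def packages_query(names=None):
    """Build a query that lists the Python packages available from Anaconda.

    Assumption: the packages are exposed through INFORMATION_SCHEMA.PACKAGES
    with the columns PACKAGE_NAME, VERSION and LANGUAGE.
    """
    sql = (
        "SELECT package_name, version "
        "FROM information_schema.packages "
        "WHERE language = 'python'"
    )
    if names:
        # Quote each name as a SQL string literal, doubling embedded quotes.
        quoted = ", ".join("'" + n.replace("'", "''") + "'" for n in names)
        sql += f" AND package_name IN ({quoted})"
    return sql + " ORDER BY package_name, version"


# All Python packages, and only 'faker' and 'pandas'.
all_packages_sql = packages_query()
selected_sql = packages_query(["faker", "pandas"])
print(all_packages_sql)
print(selected_sql)
```

The printed statements can be pasted into a SQL worksheet, or passed to `session.sql(...)` inside a Python worksheet.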

There might be a need to include packages that are not available from Anaconda. According to the documentation, you can add a Python File from a Stage to a Worksheet.

Preparing Python Worksheet

Before you start developing in Snowflake Worksheets for Python, the worksheet needs to be prepared.

  1. Select the Python Worksheet
  2. Select a database
  3. Select a warehouse
  4. Define how you want to run the worksheet
    • set the ‘Handler’ (the function to be called when executing the worksheet)
    • set the ‘Return type’ (the type of result returned by the ‘Handler’)
  5. Select the required packages
    • first select the package, then import the package in the worksheet
  6. Try to run the sample code to make sure things work.
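To make the ‘Handler’ and ‘Return type’ settings concrete, here is a minimal sketch of a handler. It is not the exact sample code that Snowflake generates. It assumes the handler is named `main`, that the return type is set to Table(), and that the PACKAGES view is reachable as `information_schema.packages`:

```python
def main(session):
    # 'session' is the Snowpark Session that the worksheet passes to the handler.
    # Assumption: the worksheet's Return type is set to Table(), so the
    # handler returns a Snowpark DataFrame.
    packages = session.table("information_schema.packages")
    # filter() accepts a SQL expression given as a string.
    return packages.filter("language = 'python'")
```

If the return type were String or Variant instead, the handler would have to return a matching value, for example a message string rather than a DataFrame.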

Developing in Snowflake Worksheets for Python

When starting a new Python Worksheet, the sheet is filled with some sample code. This code can easily be changed according to your own needs. The code I needed to generate a simple Customer table is on GitHub.
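The GitHub script itself is not reproduced here. As a rough sketch of the same idea, and not the actual code from the repository, a handler that generates fake customers and (re-)creates a table could look like this. The column names, the row count of 100 and the table name CUSTOMER are assumptions chosen for illustration:

```python
# Hypothetical column layout for the demo table.
CUSTOMER_COLUMNS = ["CUSTOMER_ID", "FIRST_NAME", "LAST_NAME", "EMAIL", "CITY"]


def build_customers(fake, count):
    """Return `count` rows of fake customer data as tuples."""
    return [
        (i, fake.first_name(), fake.last_name(), fake.email(), fake.city())
        for i in range(1, count + 1)
    ]


def main(session):
    # Imported here so the helper above has no hard dependency on Faker;
    # 'faker' must be selected under Packages in the worksheet.
    from faker import Faker

    rows = build_customers(Faker(), 100)
    customers = session.create_dataframe(rows, schema=CUSTOMER_COLUMNS)
    # (Re-)create the table on every run; CUSTOMER is a hypothetical name.
    customers.write.mode("overwrite").save_as_table("CUSTOMER")
    return customers
```

Keeping the row generation in its own function means it can be tried outside Snowflake with any object that offers the same four methods.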

  1. Add the code to the Python Worksheet and run the code
  2. View the results
  3. In this case the script (re-)creates and loads a table
    • view the results by selecting from the newly created table

Note: If you run code via the Snowsight UI, each run takes roughly 10 to 20 seconds longer, because the code executes as a temporary Stored Procedure every time. Registering it as a Permanent Stored Procedure eliminates this extra execution time.

Deploy the code as a Stored Procedure

With a simple click of a button your Python script can be deployed as a Permanent Stored Procedure.

  1. Click ‘Deploy’
  2. Name the Stored Procedure
    • Name
    • Description
    • Overwrite existing (Y/N)
  3. Call procedure
  4. Verify results in table
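Steps 3 and 4 can also be done from a Python worksheet. The sketch below assumes the procedure was deployed as GENERATE_CUSTOMERS and loads a table called CUSTOMER. Both names are hypothetical, and the worksheet's return type is assumed to be String:

```python
def call_and_count(session, procedure_name, table_name):
    """Call a deployed stored procedure, then count the rows in its target table."""
    session.call(procedure_name)
    return session.table(table_name).count()


def main(session):
    # Both names are hypothetical; use the name entered in the Deploy dialog
    # and the table your own script loads.
    rows = call_and_count(session, "GENERATE_CUSTOMERS", "CUSTOMER")
    return f"CUSTOMER contains {rows} rows"
```

In a SQL worksheet the equivalent would be a CALL statement followed by a SELECT on the table.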

Closing Statements

Python Worksheets for Snowflake are now in Public Preview, which means they are available to everyone. They offer a native Python code editor in Snowsight, including IntelliSense with auto-complete, plus support for Snowpark and for third-party libraries, either via Anaconda or via manual upload of custom packages.

Deploying the Python scripts as Python Stored Procedures is easy and makes it possible to automatically schedule them via Snowflake Tasks.
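As an illustration of such a schedule, the sketch below builds the two statements a task typically needs: one to create it and one to resume it, since new tasks start out suspended. The task, warehouse and procedure names, and the daily 06:00 UTC schedule, are all assumptions:

```python
def task_statements(task_name, warehouse, cron, procedure_name):
    """Build the SQL to create a scheduled task that calls a stored procedure."""
    create = (
        f"CREATE OR REPLACE TASK {task_name} "
        f"WAREHOUSE = {warehouse} "
        f"SCHEDULE = 'USING CRON {cron}' "
        f"AS CALL {procedure_name}()"
    )
    # A newly created task is suspended until it is resumed.
    resume = f"ALTER TASK {task_name} RESUME"
    return [create, resume]


# Hypothetical names; the cron expression means every day at 06:00 UTC.
statements = task_statements(
    "GENERATE_CUSTOMERS_TASK", "COMPUTE_WH", "0 6 * * * UTC", "GENERATE_CUSTOMERS"
)
for statement in statements:
    print(statement + ";")
```

The printed statements can be run in a SQL worksheet by a role that is allowed to create and execute tasks.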

More details are in the Snowflake documentation: ‘Writing Snowpark Code in Python Worksheets’.

Till next time.

Daan Bakboord – DaAnalytics
