Python with Virtualenv in Zuar Runner
Runner is Zuar's data pipeline solution. Learn about importing a Python package we don't have installed on Runner, such as pandas or numpy.
This article covers importing a Python package that is not installed on Zuar Runner, such as pandas or NumPy. To do this, we need a virtual environment, and it must be created manually on your Runner before any script can use it. In this example, I created a new directory called testenv in /var/runner/data/, ran virtualenv -p /usr/bin/python3 testenv/ to create the environment, activated it with source testenv/bin/activate, and finally used pip to install pandas and numpy.
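The setup steps above can also be sketched in Python. This is a minimal sketch, not an official Zuar procedure: the function names build_setup_commands and run_setup are hypothetical, and it assumes the virtualenv command is available on the Runner's PATH. Instead of sourcing the activate script, it calls the environment's own pip directly, which installs into the same environment.

```python
import subprocess
from pathlib import Path


def build_setup_commands(env_dir, packages, python="/usr/bin/python3"):
    """Return the commands that create the environment and install packages.

    Hypothetical helper: the paths mirror the ones used in this article.
    """
    env_dir = Path(env_dir)
    # Same command as in the article: virtualenv -p /usr/bin/python3 testenv/
    create = ["virtualenv", "-p", python, str(env_dir)]
    # Calling the environment's pip directly avoids the need to activate it.
    install = [str(env_dir / "bin" / "pip"), "install", *packages]
    return [create, install]


def run_setup(env_dir, packages):
    """Run each setup command, stopping at the first failure."""
    for cmd in build_setup_commands(env_dir, packages):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands only; call run_setup(...) to actually execute them.
    for cmd in build_setup_commands("/var/runner/data/testenv", ["pandas", "numpy"]):
        print(" ".join(cmd))
```

Printing the commands first makes it easy to review them before running run_setup on the Runner itself.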
With the virtual environment set up, the following script creates a small pandas Series and writes its string representation to the file /var/runner/data/also_written_by_python.txt:
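import numpy as np
import pandas as pd

# Build a small Series; np.nan marks a missing value.
s = pd.Series([1, 3, 5, np.nan, 6, 8])

# The context manager closes the file even if the write fails.
with open("/var/runner/data/also_written_by_python.txt", "w") as f:
    f.write(str(s))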
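To run this script with the packages installed in the virtual environment, execute it with the environment's own python interpreter rather than the system one. With the script saved as /var/runner/data/python_test.py, the command looks like this: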
{
    "cmd": "/var/runner/data/testenv/bin/python /var/runner/data/python_test.py",
    "shell": true
}
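If a job fails with an import error, it can help to confirm which interpreter actually ran the script. The following is a minimal sketch of such a check; the function name in_virtual_environment is hypothetical. It relies on the fact that older virtualenv releases set sys.real_prefix, while newer releases and the standard venv module make sys.prefix differ from sys.base_prefix.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases expose the original prefix as sys.real_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer virtualenv releases and venv keep the original prefix in sys.base_prefix.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtual_environment())
```

Run with /var/runner/data/testenv/bin/python, the interpreter path printed should point inside the testenv directory.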
The resulting file should look like this:
$ cat also_written_by_python.txt
0    1.0
1    3.0
2    5.0
3    NaN
4    6.0
5    8.0
dtype: float64
About Zuar Runner
Pulling data into a single destination and normalizing that data, whether in the cloud or on-premises, can be difficult for any organization. Zuar's Runner solution provides comprehensive ETL and automated pipeline functionality without the learning curve and cost of many other solutions.