
How to set virtualenv for a crontab?

Recently I was asked to batch-run a Python script inside a virtualenv and to schedule it with crontab as well.

For example, take pyspark_hello_world.py:

import sys
from operator import add

from pyspark import SparkContext

sc = SparkContext()
# list() on a string yields its characters, so this counts characters.
data = sc.parallelize(list(sys.argv[1]))
counts = data.map(lambda x: (x, 1))\
    .reduceByKey(add)\
    .sortBy(lambda x: x[1], ascending=False)\
    .collect()
# Print each character with its count, most frequent first.
for (char, count) in counts:
    print("{}: {}".format(char, count))
sc.stop()

I wanted to run it from bash. After some research, I came up with this wrapper script:

#!/bin/bash
source /Users/steven/.pyenv/versions/3.6.4/envs/ts/bin/activate

# virtualenv is now active, which means your PATH has been modified.
# Don't try to run python from /usr/bin/python, just run "python" and
# let the PATH figure out which version to run (based on what your
# virtualenv has configured).

python "$@"

# Another way: pipe the activation and the python call into bash in one line
#echo 'source /Users/steven/.pyenv/versions/3.6.4/envs/ts/bin/activate; python /Users/steven/tmp/hello.py' | /bin/bash
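The comment in the script says that activation modifies PATH. As a rough illustration of what that means, here is a minimal Python sketch of the environment changes the activate script makes: it sets VIRTUAL_ENV, prepends the virtualenv's bin directory to PATH, and unsets PYTHONHOME. The function name venv_environment is my own, and the sketch ignores the prompt changes that activate also makes.

```python
import os


def venv_environment(venv_dir, base_env=None):
    """Return a copy of the environment adjusted roughly as `activate` would.

    Hypothetical helper for illustration; it does not touch os.environ.
    """
    env = dict(os.environ if base_env is None else base_env)
    # virtualenv puts executables in "Scripts" on Windows and "bin" elsewhere.
    bin_dir = os.path.join(venv_dir, "Scripts" if os.name == "nt" else "bin")
    env["VIRTUAL_ENV"] = venv_dir
    # Prepending bin_dir is what makes a bare "python" resolve to the
    # virtualenv's interpreter instead of the system one.
    env["PATH"] = bin_dir + os.pathsep + env.get("PATH", "")
    # activate also unsets PYTHONHOME so it cannot redirect the interpreter.
    env.pop("PYTHONHOME", None)
    return env
```

The returned dictionary can be passed as env= to subprocess.run() when launching the script from another Python process, which has much the same effect as sourcing activate first.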

Save the bash script above as “runpy” and make it executable (chmod +x runpy). Now I can easily run the Python script from bash:

./runpy pyspark_hello_world.py "hello world"

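Or use it in crontab (this entry runs every day at 09:00). The script reads sys.argv[1], so the argument has to be passed here too:

0    9    *    *    *    /path/to/runpy /path/to/pyspark_hello_world.py "hello world"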
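Cron hands the command part of the entry to a shell, so an argument that contains spaces has to be quoted. If you generate crontab entries from a script, a small sketch like the following can do the quoting for you. The function crontab_line is a hypothetical helper of mine, not part of any cron tooling, and it does not handle the special meaning of % in crontab commands.

```python
import shlex


def crontab_line(schedule, wrapper, script, *args):
    """Build one crontab entry: the five schedule fields plus a quoted command.

    Hypothetical helper for illustration only.
    """
    # shlex.quote leaves plain paths untouched and wraps anything with
    # spaces or shell metacharacters in single quotes.
    command = " ".join(shlex.quote(part) for part in (wrapper, script) + args)
    return schedule + " " + command


print(crontab_line("0 9 * * *", "/path/to/runpy",
                   "/path/to/pyspark_hello_world.py", "hello world"))
# 0 9 * * * /path/to/runpy /path/to/pyspark_hello_world.py 'hello world'
```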

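By the way, on Windows we can add the following code at the top of pyspark_hello_world.py (activate_this.py ships with virtualenv; environments created with python -m venv do not include it):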

activate_this = "D:\\venv\\Scripts\\activate_this.py"
with open(activate_this) as f:
    exec(f.read(), {'__file__': activate_this})

and runpy.bat looks like this:

C:\ProgramData\Anaconda3\python.exe  d:\pyspark_hello_world.py "hello world"
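Whether the script is started from cron or from runpy.bat, it can be handy to check that it really runs inside the virtualenv. A minimal sketch, assuming the usual convention that a virtualenv either sets sys.real_prefix (older virtualenv releases and activate_this.py) or makes sys.prefix differ from sys.base_prefix; the function name in_virtualenv is my own:

```python
import sys


def in_virtualenv(sys_module=sys):
    """Return True if the interpreter appears to run inside a virtualenv.

    sys_module is a parameter only so the check can be tried on stand-in
    objects; by default the real sys module is inspected.
    """
    # Older virtualenv releases and activate_this.py set sys.real_prefix.
    real_prefix = getattr(sys_module, "real_prefix", None)
    # venv and newer virtualenv releases keep the original location in
    # sys.base_prefix and point sys.prefix at the environment.
    base_prefix = getattr(sys_module, "base_prefix", sys_module.prefix)
    return real_prefix is not None or sys_module.prefix != base_prefix
```

Printing in_virtualenv() and sys.executable at the top of the job and redirecting the output to a log file is a quick way to see which interpreter cron actually picked up.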
