Currently browsing tag

pyspark

Pay attention to the union function in PySpark

In SQL the UNION clause combines the results of two SQL queries into a single table of all matching rows. The two queries must result in the same number …

How to read SAS data with PySpark

For some reason, I have to convert SAS data to HDFS and then analyse it with PySpark. After some research I found spark-sas7bdat is …

How to set virtualenv for a crontab?

Recently I was asked to batch-run a Python script in a virtualenv and also to run it from crontab, for example pyspark_hello_world.py

I …