Pay attention to the union function of PySpark

In SQL, the UNION clause combines the results of two SQL queries into a single table of all matching rows. The two queries must return the same number of columns, with compatible data types, in order to be united. Any duplicate records are automatically removed unless UNION ALL is used. UNION can be useful in data warehouse applications where tables aren’t perfectly normalized.[2] A simple example would be …
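To make the difference between UNION and UNION ALL concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table names (sales_2023, sales_2024) and the rows are made up for illustration and do not come from the original post.

```python
import sqlite3

# In-memory database with two hypothetical tables that share one row.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_2023 (customer TEXT, amount INTEGER);
    CREATE TABLE sales_2024 (customer TEXT, amount INTEGER);
    INSERT INTO sales_2023 VALUES ('alice', 10), ('bob', 20);
    INSERT INTO sales_2024 VALUES ('bob', 20), ('carol', 30);
""")

# UNION removes the duplicate ('bob', 20) row.
union_rows = conn.execute(
    "SELECT customer, amount FROM sales_2023 "
    "UNION "
    "SELECT customer, amount FROM sales_2024 "
    "ORDER BY customer"
).fetchall()

# UNION ALL keeps every row from both queries, duplicates included.
union_all_rows = conn.execute(
    "SELECT customer, amount FROM sales_2023 "
    "UNION ALL "
    "SELECT customer, amount FROM sales_2024 "
    "ORDER BY customer"
).fetchall()

conn.close()

print(union_rows)           # three distinct rows
print(len(union_all_rows))  # four rows, ('bob', 20) appears twice
```

For comparison, PySpark's DataFrame.union follows UNION ALL semantics and matches columns by position rather than by name, so duplicates stay unless you call distinct() afterwards, and unionByName is the variant that matches columns by name.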

How to read SAS data with PySpark

For some reason, I have to convert SAS data to HDFS and then analyse it with PySpark. After some research, I found that spark-sas7bdat is the best solution for me. This package allows reading SAS files in a local or distributed filesystem as Spark DataFrames. Schema is automatically inferred from meta information embedded in the …

Python For SAS Users

I am a coder, and I focus on machine learning now. For some reason, I learned SAS 2 years ago. For a coder, …