Consider the code snippet
rdd=sc.parallelize([("1001","john","finance","20000"),("1002","Harry","finance",""),("1003","marry","finance","30000"),("1004","jim","finance","")])
The above dataset contains employee information. The schema is (employeeId, employeeName, department, salary)
Complete the code to find out how many employees have missing salary component? Choose three options.

Question

Consider the code snippet

rdd=sc.parallelize([("1001","john","finance","20000"),("1002","Harry","finance",""),("1003","marry","finance","30000"),("1004","jim","finance","")])

The above dataset contains employee information. The schema is (employeeId, employeeName, department, salary)

Complete the code to find out how many employees have missing salary component? Choose three options.

anonymous · Accepted Answer

Correct Solution - abd