Data Architect

Kaztronix

(Mount Laurel, New Jersey)
Full Time
Job Posting Details
About Kaztronix
Kaztronix was founded in 2002 with a vision to impact the staffing industry by making the finest talent and consulting solutions easily accessible to organizations spanning numerous industries. Simply put, we source and qualify the best industry talent available, providing flexible solutions to meet the business objectives of our clients.
Responsibilities
* Assume you have a dataset that contains 5 columns. Each column contains 100K rows of numerical values that range from 7.16206E+12 to 7.16206E+12
* Dataset file format is CSV
* For testing you can produce the dataset using a random number generator
* Read in the data
* Sort the data ascending in each column
* For each column, compute the 5 Number Summary (min value, Quartile 1, median, Quartile 3, max value)
* Persist the results to a Cassandra table
* Show the CQL necessary to create the table that will store results. Include primary key, composite key, indexes, cluster ordering as needed
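The steps above can be sketched in a few lines of plain Python. This is only an illustrative sketch under stated assumptions, not the expected solution: the posting asks for Spark and Cassandra, which are not used here. The value bounds `LOW` and `HIGH`, the column names `c1` to `c5`, the keyspace `analytics`, the table name `five_number_summary` and the `dataset_id` column are all hypothetical. Quartiles are computed as the medians of the lower and upper halves of the sorted data (excluding the middle value when the count is odd), which is one of several common conventions. The CQL is held in a string and is not executed.

```python
import csv
import io
import random

# Assumed bounds: the posting prints both ends as 7.16206E+12, so any
# values that round to that figure are used here (hypothetical choice).
LOW, HIGH = 7.162055e12, 7.162065e12


def make_test_csv(rows, cols=5, seed=0):
    """Produce CSV text with a header row and random values per column."""
    rng = random.Random(seed)
    buf = io.StringIO(newline="")
    writer = csv.writer(buf)
    writer.writerow([f"c{i}" for i in range(1, cols + 1)])
    for _ in range(rows):
        writer.writerow([rng.uniform(LOW, HIGH) for _ in range(cols)])
    return buf.getvalue()


def read_columns(csv_text):
    """Read CSV text into a dict mapping column name to a list of floats."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader)
    columns = {name: [] for name in header}
    for row in reader:
        for name, cell in zip(header, row):
            columns[name].append(float(cell))
    return columns


def median_sorted(values):
    """Median of an already sorted, non-empty list."""
    n = len(values)
    mid = n // 2
    if n % 2:
        return values[mid]
    return (values[mid - 1] + values[mid]) / 2


def five_number_summary(values):
    """Sort ascending, then return min, Q1, median, Q3 and max."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    s = sorted(values)
    n = len(s)
    half = n // 2
    lower = s[:half]               # values below the median position
    upper = s[half + (n % 2):]     # values above the median position
    return {
        "min": s[0],
        "q1": median_sorted(lower),
        "median": median_sorted(s),
        "q3": median_sorted(upper),
        "max": s[-1],
    }


# One possible table layout (hypothetical names). dataset_id is the
# partition key and column_name the clustering column, so all five rows
# of one dataset sit in a single partition, ordered by column name.
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS analytics.five_number_summary (
    dataset_id  text,
    column_name text,
    min_value   double,
    q1          double,
    median      double,
    q3          double,
    max_value   double,
    PRIMARY KEY ((dataset_id), column_name)
) WITH CLUSTERING ORDER BY (column_name ASC);
"""

if __name__ == "__main__":
    data = read_columns(make_test_csv(rows=100_000))
    for name, values in data.items():
        print(name, five_number_summary(values))
    print(CREATE_TABLE_CQL)
```

With this layout, every read is by `dataset_id`, so no secondary index is needed in the sketch. A real design would depend on the queries the results table has to serve.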
Ideal Candidate
* Cassandra (2.1) data modeling and CQL (3.2). Administration would be a big plus
* Developing Spark (1.4.2) analytic jobs in Scala, SQL. Administration would be a big plus
* The DataStax (4.8.4) distribution of Cassandra/Spark
* Job workflow manager like Azkaban
* Gradle, Maven
* Git, GitHub
* CI/CD servers like Jenkins/Go CD
* Linux
