Best coding language to learn for data analysis?
@Owlbert Which one is easiest for beginners / most needed in a professional environment?
@PoliticalVoyeur Python has a ton more resources online and for free. Good place to start. Maybe hold off on R for now.
@PoliticalVoyeur @Owlbert this is entirely contingent on the type of data analysis you're doing, what the rest of the group uses, that sort of thing. Basically, if you're working with a lot of statisticians and using more "statistical" methods, #R is going to be the right idea. If you're working with ML types, Python might be better.
@gzthompson @PoliticalVoyeur Thank you! I appreciate the clarification. :smiley:
@gzthompson @PoliticalVoyeur @Owlbert Adding a bit: if you're a beginner, I suggest you start with #Python
@red @PoliticalVoyeur @Owlbert I would also note that, if what you're doing is taking already cleaned data, maybe reshaping it, and dumping it into a model, your "coding skills" don't need to be *that* great. But in that case you need to be heavy on statistics knowledge. However, if you don't know the theory...
@Owlbert @gzthompson
Let's say, hypothetically, I am using data sets based on demographic information from voter rolls. Would you suggest R or Python?
@PoliticalVoyeur I don't know what you're planning on doing with the data, so the generic advice is Python, because it's easy to learn how to do what you want. The exception is when I know that what you're doing can just be dropped into R or some R package and it will spit out what you want.
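As an illustration of the "easy to learn how to do what you want" point, here is a small sketch that counts voters by age bracket using only the Python standard library. The CSV layout, the column names (`voter_id`, `age`, `party`), and the bracket cutoffs are assumptions made up for this example. Real voter-roll files will look different.

```python
import csv
import io
from collections import Counter

# Hypothetical voter-roll extract; columns and values are invented.
SAMPLE = """voter_id,age,party
1,23,D
2,67,R
3,45,I
4,34,D
5,71,R
"""

def age_bracket(age):
    """Map an age in years to a coarse bracket label (cutoffs are arbitrary)."""
    if age < 30:
        return "18-29"
    if age < 65:
        return "30-64"
    return "65+"

def count_by_bracket(csv_text):
    """Count rows per age bracket in CSV text that has an 'age' column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(age_bracket(int(row["age"])) for row in reader)

counts = count_by_bracket(SAMPLE)
for bracket, n in sorted(counts.items()):
    print(bracket, n)
```

For a real file you would pass an opened file to `csv.DictReader` instead of the in-memory string.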
@PoliticalVoyeur The great strength of R is that there are a lot of packages that "do what you want" if you know what you're doing. The great weakness of R is that it's a quirky language, can be very slow, and was designed to specifically work for statistics.
@PoliticalVoyeur As a relative newbie to this space, it seems that the big three in terms of usage are R and Python (as they have a lot of built-in features that make analysis relatively easy) and Scala (on the data eng./streaming side of life. Hive might also fit here). For high-performance, heavy number crunching I think you would also need a systems language like C, or Erlang (for its easy handling of concurrency).
@PoliticalVoyeur R or Python come to mind.