data science in clojure

Mentor: Daniel Slutsky (daslu)
Project Website: https://scicloj.github.io/
Project Repository: https://github.com/scicloj/clojisr
Suitable for Beginners? yes
Tags: clojure, R, data science, data visualization
State: accepted
Applications (1st Choice): 7 (6 submitted | 1 in-progress)
Applications (2nd Choice): 3 (3 submitted | 0 in-progress)
Code of Conduct
License: EPL 2.0

Project Description

Clojure is a programming language that encourages simplicity,
data-driven programming and an interactive experience of coding and
data exploration.

ClojisR is a bridge that allows Clojure to use R, a popular
programming language for statistics. It seeks to combine Clojure's
unique approach towards data exploration with R's huge collection of
data visualization and statistical libraries. It is in beta stage --
already used by several people in their everyday work, but not yet
stable enough for production.

This is part of a broader effort taking place these days, to improve
Clojure's data science stack. A large part of that effort is happening
in an open source community called Scicloj.

The goal of this RGSoC project is to take an active part in improving
Clojure's data science stack, especially in aspects of usability,
user-facing features and documentation.

People involved:
I am submitting this project as the main author of ClojisR and a
co-organizer of the Scicloj community.
Throughout the process, I continuously ask the Scicloj community for feedback.
A couple of other community members will probably be able to join as coaches.

Code of conduct:
I have read the RGSoC code of conduct thoroughly, and I feel that it matches the values of the Scicloj community. We would love to use it for the project and to inform everyone involved.

Project's Requirements

To join the project, it is preferable to have the following:
- knowledge of either Clojure(1) or R(2), and at least one year of
programming experience in that language;
- an appetite to learn the other language, and a very open mind about it;
- at least some basic knowledge in statistics (say, 1st-year university courses);
- some experience in data-research tasks (that is, getting a dataset, exploring it, answering some questions about it).

(1) say, feeling comfortable with chapters 1, 3, 4, 5, 6, 12 and 13 of
"Clojure for the Brave and True" by D. Higginbotham
https://www.braveclojure.com/clojure-for-the-brave-and-true/

(2) say, feeling comfortable with the book "R for Data Science" by
G. Grolemund and H. Wickham
https://r4ds.had.co.nz

If you do not have exactly this background, but are still interested in
Clojure and data science, then let us talk and see if this project
can be adapted to fit your journey.
If you are interested in specific parts of the Clojure data science stack (e.g., Clojure-Python interop, data visualization, probabilistic programming), let us talk and think about it.

In any case, I would love to discuss it further.

Tasks And Features

The idea is to work in a case-driven fashion:
* The team will choose data science problems, matching the taste and experience of the team members. These can be either new problems, or existing stories, already written by others, e.g. Kaggle notebooks.
* The team will implement problem solutions using ClojisR and other libraries.
* Through the work, we will reason about the usability of the
libraries used, what functionality may be missing, and what could be improved.
* We will discuss and prioritize what can be actually improved in the libraries.
* Implementing the missing pieces may be done either by the usual library
maintainers, or by the team. In any case, the team will be
involved in the thought process.
* Team members will join broader discussions of the Scicloj community,
and share their experience, thoughts and ideas.
* The team will create an online open-source book with the collection
of examples implemented.

This project will have an important role in the way the Scicloj
community reasons about its progress, goals and priorities.

The fruit of the process is twofold:
* continuously realizing what should be improved in the evolving stack;
* creating a collection of examples solving realistic use cases.


If you are thinking of applying to this project for RGSoC 2020 and have any questions, feel free to contact the project mentor by leaving a comment below or using the following channels:

Comments



Daniel Slutsky, Monday, April 6, 14:59 UTC

Oh, now I see your application, which I have been reading in the last couple of days.
Nice!

By the way, I could not access this link that you mentioned in your application:
https://github.com/Jadamoureen/YonjaPaymentPlatform
If you wish, you can write another link here.

It is not required to get involved in the project before the summer, but if you wish to start exploring, a good first step would be to start learning Clojure, as suggested in the comments below.

Thank you so much for your interest in the project!


Moureen Caroline Ochieng , Monday, April 6, 13:20 UTC

Hello Daniel, we applied as team Manifest_Data before the deadline, that is, before the 31st of March 2020. Although I wrote to you after the deadline, we had already applied.


Daniel Slutsky, Monday, April 6, 13:08 UTC

Dear Moureen and Sandra,
thank you so much for reaching out.

As far as I know, it is already too late to apply as students for Rails Girls Summer of Code -- see the timeline towards the bottom of this page:
https://railsgirlssummerofcode.org

Did you send an application last week?

Of course, if you are still interested in getting involved, without applying as a team, that would be lovely, and I will be very happy to discuss it further. Could you tell me more about what may be interesting to you?


Moureen Caroline Ochieng , Monday, April 6, 12:59 UTC

Hey Daniel Slutsky, I am Moureen. My teammate Sandra and I would love to participate and contribute towards your project. We are both interested in Data Science. We have gone through the resources and would like to start our contribution. Please let us know how we can be helpful to you.


Siddhant Pathak, Friday, March 27, 15:13 UTC

Hi Daniel Slutsky, I would like to contribute in expanding the data science problem solving capabilities across various cross platform languages. This would enable the future data scientists to think on problem solving which would be more language independent.


Daniel Slutsky, Friday, March 27, 14:14 UTC

Hi Siddhant Pathak, thank you so much for reaching out!

At the moment I am not aware of teams that have only one coach, but several teams with two coaches told me they would be happy to have a third coach.

Let us keep discussing this.

If you wish, it would be great to hear more about why this project is interesting to you.


Siddhant Pathak, Friday, March 27, 13:14 UTC

Hello Team 'data science in clojure' . I am a data scientist at a leading gaming industry based in India. The skills required for this project match with my domain. So I would like to coach / mentor your team. Feel free to DM me through linkedin at linkedin/siddhant96


Daniel Slutsky, Wednesday, March 25, 01:07 UTC

Hello Saman and Ayman! Thank you so much for your kind message!

I would love to hear more about what may be interesting to you in this project.

The first step would be to start getting comfortable with Clojure.
Have you played a little with Clojure? How do you find it?
Please tell me if you find anything strange or difficult -- I will be happy to help.


Saman Gaziani, Tuesday, March 24, 20:38 UTC

Hey! I am Saman. My teammate Ayman and I would love to participate and contribute towards your project. I have prior experience with R applied to solve statistical questions. We are both interested in Data Science and have completed a certification for Data Science in Python by DataCamp. We have gone through the resources and would like to start our contribution. Please let us know how we can be helpful to you.


Lyne Naluwaga, Saturday, March 21, 15:13 UTC

Dear tea-n-biccies, thank you so much for the comment. My partner Eva and I are having trouble answering some questions in the application. Is there a place we can go for help with this?


tea-n-biccies RGSoC, Friday, March 20, 13:43 UTC

Dear RGSoC applicants - we have added a new FAQ page to the website. Please check this out before asking mentors your questions, as we may already have an answer for you :)
https://railsgirlssummerofcode.org/students/faq

Further details of how to apply to RGSoC (by 23:00 UTC on 30 March 2020) can be found at https://railsgirlssummerofcode.org/students


Daniel Slutsky, Friday, March 20, 00:10 UTC

Hello Niharika and Divija! Thanks for your kind message.
If you are interested in contributing, I think it is a good idea to get a little comfortable with the language first. After you feel comfortable with some of the book chapters, we may discuss some further ways to explore, with a small problem. If you are reading about Clojure, I think it would be a good idea to have a development environment where you can try things and explore further.
Please tell me if you run into anything strange or difficult. : )


Daniel Slutsky, Friday, March 20, 00:07 UTC

Hello Eva and Lyne, thank you so much for your interest in this project! Please tell me if you wish to explore a little, to see if you like the project, or if you have any questions.


Niharika Gali, Thursday, March 19, 06:41 UTC

Hi! I'm Niharika. My teammate Divija and I would love to contribute. We don't have prior experience in Clojure but are well-versed data scientists who have done extensive internships and projects in the area. We've also taken a full-blown statistics course in uni. We've briefed over the comments section and will get started on reading the resources. Please do let us know how we can contribute!


Eva Nanyonga, Thursday, March 19, 03:17 UTC

Hello, I am Eva and my team-mate is Lyne from team Apollo11. We are excited to be participating in RGSoC, are both on Data Science paths and have a great interest in Clojure, a language in which your project is written. We have majorly been working in Python for Data Science but we are now ready to spread our wings into being diverse developers. We would very much like to contribute to your project and are excited to journey with you.


tea-n-biccies RGSoC, Monday, March 9, 11:11 UTC

Hi everyone - the RGSoC team here :)
Just a reminder for all applicants that student applications are open until 23:00 UTC on 30 March 2020.
For information on how to apply as a student so you can work on this project with RGSoC, please read the guidance at https://railsgirlssummerofcode.org/students


Daniel Slutsky, Friday, March 6, 07:30 UTC

This list by yogthos is a good collection of resources to begin with Clojure:
https://gist.github.com/yogthos/be323be0361c589570a6da4ccc85f58f

After you set up a development environment with your preferred editor (I recommend VS Code if you do not have any preference), it is a good idea to read chapters 1, 3, 4, 5 and 6 of "Clojure for the Brave and True" by D. Higginbotham
https://www.braveclojure.com/clojure-for-the-brave-and-true/


vellanki gayathri, Thursday, March 5, 14:30 UTC

I am excited to learn and solve problems using Clojure. I will refer to the Resources you shared and will be discussing with you the whole Journey.


Daniel Slutsky, Thursday, March 5, 13:53 UTC

Sorry for the typo in your name, Gayathri.


Daniel Slutsky, Thursday, March 5, 13:53 UTC

Dear Garathri, thank you so much for your interest in this project.

You may find this blog post useful. It tells a bit more about the project, its tasks, and why it is so important to us at scicloj.
https://scicloj.github.io/posts/2020-03-05-rgsoc2020/

The main task will be to choose data-science problems that you find interesting, and solve them in Clojure. The other tasks -- discussing, prioritizing and solving issues in the libraries, making a coherent story as an open source book, and engaging in the community -- will be the fruit of the work on problems.

Of course, the project itself will begin in the summer, after the teams are chosen. But if you wish to explore the field earlier, I would be happy to discuss it as much as you wish.

If you are new to Clojure, then I recommend beginning by learning and playing with the language for some time.
You can find some setup instructions for different tools here at the Practicalli website:
https://practicalli.github.io/clojure/development-tools/
The development environment in Visual Studio Code, called Calva, is probably the most comfortable to begin with, unless you are already used to some other toolsets. Please tell if you need some help setting it up.
You can also try Clojure online here:
https://repl.it/languages/clojure

I would love to hear more about your background and interests in other languages and tools, so that I can hopefully give better advice on how to begin.

You can also contact me directly as "Daniel Slutsky" at the Clojurians Zulip:
https://clojurians.zulipchat.com
or on Twitter:
https://twitter.com/daslu_


vellanki gayathri, Thursday, March 5, 13:28 UTC

I am Gayathri Vellanki and my teammate is Niharika M. We are from team Linux Lions and are looking forward to participating in RGSoC. We are really interested in your project. All the details mentioned give a clear understanding of the project, but can you guide us on what tasks we are supposed to do in the initial phase, and how and where we can get started?