A data exploration project using data from: https://www.craigoates.net

#+options: ':nil *:t -:t ::t <:t H:3 \n:nil ^:t arch:headline author:t
#+options: broken-links:nil c:nil creator:nil d:(not "LOGBOOK") date:t e:t
#+options: email:nil f:t inline:t num:t p:nil pri:nil prop:nil stat:t tags:t
#+options: tasks:t tex:t timestamp:t title:t toc:t todo:t |:t
#+title: CO-Data: README
#+date: \today
#+author: Craig Oates
#+email: craig@craigoates.net
#+language: en
#+select_tags: export
#+exclude_tags: noexport
#+creator: Emacs 29.0.60 (Org mode 9.6.1)
#+cite_export:
#+export_file_name: ./exported/readme.html
* <2023-03-26 Sun> Project Information
This is a data exploration project using data from [[https://www.craigoates.net][craigoates.net]]. It does
this using the [[https://en.wikipedia.org/wiki/Literate_programming][Literate Programming]] approach. If you are unfamiliar with
Literate Programming, you might find the lack of /code/ files unusual, but don't
worry: this is normal when approaching programming this way. Also, if you are
unfamiliar with the /.org/ file extension, think of it as an expanded version of a
Markdown (.md) file. The /code blocks/ in these .org files are what get executed,
and the text surrounding them adds context and helps explain what the code
is doing.

The project assumes familiarity with the following:
- Emacs
- Org-Mode
- Org-Babel
- Common Lisp
- Bash
- Linux (Ubuntu/Debian)
/The code in this project should run on Mac and Windows; it just hasn't been
tested on those systems./
To get the project onto your machine,
#+begin_src shell :results code
cd <PATH TO PLACE YOU WANT TO CLONE REPO. TO>
git clone https://git.abbether.net/craig.oates/co-data.git
cd co-data
#+end_src
Let the exploration begin...
* <2023-03-26 Sun> Project/Environment Set-up
Before you start opening the other files, you need to make sure you have set up
your environment properly after a fresh clone of the repository.
#+begin_src shell :results silent
# Make sure you are at the project's root.
mkdir -p output exported
#+end_src
Your version of the repository should look something like the following,
#+begin_src shell :results code
tree -L 2
#+end_src
#+RESULTS:
#+begin_src shell
.
├── artwork.org
├── data
│   └── co-production-2023-03-21.db
├── exported
├── LICENSE
├── output
└── README.org
3 directories, 4 files
#+end_src
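As a quick sanity check, the layout above can also be confirmed with a short
sketch like the following. The =check_layout= function is a hypothetical helper
and not part of the repository; it only tests that the three directories exist
under the path you give it.

#+begin_src shell :results output
# Sketch only: check_layout is a hypothetical helper, not part of the project.
# It reports whether data/, output/ and exported/ exist under the given root.
check_layout() {
    root="${1:-.}"
    status=0
    for dir in data output exported; do
        if [ -d "$root/$dir" ]; then
            echo "ok: $dir/"
        else
            echo "missing: $dir/"
            status=1
        fi
    done
    return "$status"
}

# Run from the project's root.
check_layout . || echo "Create the missing directories before continuing."
#+end_src

The function returns a non-zero status when a directory is missing, so it can
be reused in other scripts if needed.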
- =data/= contains all the /input/ data you want to process. You shouldn't need
  to write anything here, only read.
- =output/= is where all the (transformed) data from =data/= goes; consider this the
  project's /workbench/.
- =exported/= is used for exporting/converting the .org files (e.g. into PDF or
  HTML files). The contents of this directory are ignored by ~git~ so they do
  not clog up the commit history. I tend to view it as a place to store files
  you intend to share with others in a more conventional format.
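If you want to confirm that ~git~ really is ignoring the contents of =exported/=,
a minimal sketch along these lines should do it. It relies on ~git check-ignore~,
which exits with 0 when a path matches an ignore rule. The =is_ignored= helper is
hypothetical, and the path =exported/readme.html= is taken from this file's
=#+export_file_name= setting.

#+begin_src shell :results output
# Sketch only: is_ignored is a hypothetical helper, not part of the project.
# git check-ignore exits 0 when the path matches an ignore rule.
is_ignored() {
    git check-ignore -q "$1" 2>/dev/null
}

# Run from the project's root. The file does not need to exist yet.
if is_ignored exported/readme.html; then
    echo "exported/readme.html is ignored"
else
    echo "exported/readme.html is not ignored (or this is not a git repository)"
fi
#+end_src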
To reduce the amount of duplicated information, the contents of =data/= are
expanded upon in the other .org files and not here.

No version of the database for [[https://www.craigoates.net][craigoates.net]] *will be included in the
repository*. I will export the data to =data/=, typically as CSV files, with the
non-public data removed and use that instead of the database.
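
To give a rough idea of what such an export might look like, here is a minimal
sketch. It rests on several assumptions that are *not* confirmed by this
project: that the database is an SQLite file, that the ~sqlite3~ command-line
tool is installed, and that there is a table called =artwork= with the columns
=id=, =title= and =date_created=. The table, the columns and the
=export_public_csv= helper are all hypothetical and only there for
illustration.

#+begin_src shell :results silent
# Sketch only. ASSUMPTIONS: the database is SQLite, and the table (artwork)
# and columns (id, title, date_created) are hypothetical placeholders.
export_public_csv() {
    db="$1"   # path to the private database (kept out of the repository)
    out="$2"  # CSV destination, e.g. a file under data/
    # Name the public columns explicitly so private ones never reach the CSV.
    sqlite3 -header -csv "$db" \
        "SELECT id, title, date_created FROM artwork ORDER BY id;" > "$out"
}

# Example call (both paths are placeholders):
# export_public_csv /path/to/private.db data/artwork.csv
#+end_src

Listing the columns by name, rather than using =SELECT *=, means a private
column added to the table later is left out of the export by default.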