Code to help with the re-arranging of my life in 2024.

#+title: Overhaul 2024 Journal
#+options: ':nil *:t -:t ::t <:t H:3 \n:nil ^:t arch:headline author:t
#+options: broken-links:nil c:nil creator:nil d:(not "LOGBOOK") date:t e:t
#+options: email:nil expand-links:t f:t inline:t num:t p:nil pri:nil prop:nil
#+options: stat:t tags:t tasks:t tex:t timestamp:t title:t toc:t todo:t |:t
#+date: \today
#+author: Craig Oates
#+email: craig@craigoates.net
#+language: en
#+select_tags: export
#+exclude_tags: noexport
#+creator: Emacs 29.1.90 (Org mode 9.7-pre)
#+cite_export:
* Setup Directories
This just sets up the work environment's directories. You should only need to run
this the first time you work your way through this file.
#+begin_src shell :results output
# -p stops mkdir from failing if the directories already exist.
mkdir -p raw-data renders working-data
#+end_src
* Setup Python Virtual Environment (Properties Drawer)
:PROPERTIES:
:header-args:shell: :exports code
:header-args:elisp: :exports code
:header-args:ipython: :exports both
:header-args: :eval never-export
:END:
I've got some code [[https://git.abbether.net/craig.oates/literate-charting/src/branch/master/charting.org][here]] which has information about setting up Python to be used
in a literate programming context with org-mode.
#+begin_src shell
python3 -m venv venv
# To activate the virtual environment in your CLI...
source venv/bin/activate
#+end_src
#+begin_src shell
# Should only need this on first run.
touch requirements.txt
#+end_src
#+begin_src shell
# To activate the virtual environment in your CLI...
source venv/bin/activate
#+end_src
#+begin_src shell :results code
# To check results.
tree -L 1
#+end_src
#+begin_src elisp :results silent
;; Activate at start of coding session...
;; Change path to where you have saved this repository.
(pyvenv-activate "~/dev-shed/overhaul2024/venv")
(pyvenv-mode)
#+end_src
You can also do this via =M-x pyvenv-activate=. You should be able to put
~venv~ in the prompt and be good to go. Use =M-x pyvenv-mode= to see the
currently active virtual environment in the mode-line.
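To double-check from Python itself that the virtual environment is the one in
use, here is a minimal sketch. The function name ~in_virtualenv~ is made up for
this example. It relies on the standard-library behaviour that =sys.prefix= and
=sys.base_prefix= differ inside a venv.
#+begin_src python :results output
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active:", in_virtualenv())
    print("prefix:", sys.prefix)
#+end_src
If this reports that no venv is active, the Emacs session has probably not run
the ~pyvenv-activate~ step yet.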
#+begin_src shell :results silent
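# Snippets to install and store Python packages (in venv).
# You'll probably be better off running these at the CLI.
# Store the currently installed packages...
pip freeze > requirements.txt
# ...or install the packages already listed in the file.
pip install -r requirements.txt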
#+end_src
* Scrape Right Move Data Manchester (2024-02-18 Sun)
Found some code at [[https://scrapfly.io/blog/how-to-scrape-rightmove/#full-scraper-code][ScrapFly]] for scraping data from [[https://www.rightmove.co.uk][Right Move]]. I put it back
together (it was in code blocks for the purpose of the blog post) and modified
it to what I need. Ideally, I would have the code within this file but the code
is already written and it is easier to just run the code as a script file (from
here). *You will need to change the ~location~ variable in the ~run~ function to
search and scrape data from other parts of the country.* The data is stored in
=raw-data/=, as a JSON file, and will require further parsing and processing.
#+begin_src shell :results silent
python rightmove.py
#+end_src
* Setup Common Lisp Environment
*Run ~M-x slime~ before running the following code.* Also, make note of the
~:session~ attribute. It allows the code in this code block to be used in other
code blocks which also use the ~:session~ attribute.
#+begin_src lisp :session :results silent
(ql:quickload :com.inuoe.jzon)
(ql:quickload :plot/vega)
(ql:quickload :lisp-stat)
(ql:quickload :data-frame)
(ql:quickload :str)
#+end_src
* Convert Right Move JSON Data (2024-02-18 Sun) for Manchester to CSV File
This code goes through the JSON file, returned by Right Move, and creates a CSV
file of the data I'm most interested in. It is also prep work for using Lisp-Stat.
#+begin_src lisp :results silent :session
;; Adjust path to match the file you want to process.
(let ((data (com.inuoe.jzon:parse
             #P"raw-data/2024-02-18_21-48-46_right-move-manchester.json")))
  (with-open-file (stream
                   #P"working-data/2024-02-18_21-48-46_right-move-manchester.csv"
                   :direction :output
                   :if-exists :supersede)
    (format stream "ID, Price, Frequency, Bedrooms, Bathrooms, Display Address, Students, Latitude, Longitude, URL~%")
    (loop for homes across data
          do (format stream "~a, ~a, ~a, ~a, ~a, ~s, ~a, ~a, ~a, https://www.rightmove.co.uk/properties/~a~%"
                     (gethash "id" homes)
                     (gethash "amount" (gethash "price" homes))
                     (gethash "frequency" (gethash "price" homes))
                     (gethash "bedrooms" homes)
                     (gethash "bathrooms" homes)
                     (gethash "displayAddress" homes)
                     (gethash "students" homes)
                     (gethash "latitude" (gethash "location" homes))
                     (gethash "longitude" (gethash "location" homes))
                     (gethash "id" homes)))))
#+end_src
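For anyone more comfortable reading Python, here is a rough sketch of the same
flattening step, using only the standard library. The key names come from the
~gethash~ calls in the Lisp block above. The ~sample~ record and the ~flatten~
function name are made up for illustration, and the record is not real Right
Move data.
#+begin_src python :results output
import csv
import io

# Column order mirrors the header written by the Lisp block above.
HEADER = ["ID", "Price", "Frequency", "Bedrooms", "Bathrooms",
          "Display Address", "Students", "Latitude", "Longitude", "URL"]


def flatten(home: dict) -> list:
    """Pull the fields of interest out of one (nested) listing."""
    return [
        home["id"],
        home["price"]["amount"],
        home["price"]["frequency"],
        home["bedrooms"],
        home["bathrooms"],
        home["displayAddress"],
        home["students"],
        home["location"]["latitude"],
        home["location"]["longitude"],
        f"https://www.rightmove.co.uk/properties/{home['id']}",
    ]


# Made-up sample record (NOT real Right Move data), shaped like the
# keys the Lisp code reads with gethash.
sample = {
    "id": 1, "bedrooms": 2, "bathrooms": 1,
    "displayAddress": "Example Street, Manchester",
    "students": False,
    "price": {"amount": 1000, "frequency": "monthly"},
    "location": {"latitude": 53.48, "longitude": -2.24},
}

buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(HEADER)
writer.writerow(flatten(sample))
print(buffer.getvalue())
#+end_src
The output differs slightly from the Lisp version: ~csv.writer~ puts no space
after each comma, quotes only the fields which need it (the address, because it
contains a comma) and writes Python's =False= where the Lisp code writes =NIL=.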
* Explore CSV Data for Right Move Manchester (2024-02-18)
#+begin_src lisp :results silent :session
(defvar *rm-manchester*
  (lisp-stat:read-csv #P"working-data/2024-02-18_21-48-46_right-move-manchester.csv")
  "Data from the CSV file (after it has been processed from the JSON file).")
(lisp-stat:defdf *rm-manc-df* *rm-manchester*)
#+end_src
Having had a quick look at the CSV file, I noticed some rent prices above
£10,000 a month. This is just way outside my budget, so I could do with
filtering them out. There are, also, some entries which are car parks, so I'll
need to filter them out as well. I could, also, do with separating out the
weekly and monthly rent prices so I can compare them fairly. Filtering out the
student accommodation is something I should do, as well. I'm not a student
anymore, and haven't been for quite a long time.
#+begin_src lisp :session :results silent
(lisp-stat:write-csv
 (lisp-stat:filter-rows *rm-manc-df* '(and
                                       (string= "weekly" frequency)
                                       (string= "NIL" students)))
 #P"working-data/2024-02-18_21-48-46_right-move-manchester-weekly.csv"
 :add-first-row t)
#+end_src
#+begin_src lisp :session :results silent
(lisp-stat:write-csv
 (lisp-stat:filter-rows *rm-manc-df*
                        '(and
                          (string= "monthly" frequency)
                          (> 1500 price)
                          (string= "NIL" students)
                          ;; Filtering for the car park entries.
                          (< 275 price)
                          (not (str:contains? "car " display-address :ignore-case t))
                          (not (str:contains? "park " display-address :ignore-case t))
                          (not (str:contains? "parking " display-address :ignore-case t))))
 #P"working-data/2024-02-18_21-48-46_right-move-manchester-monthly.csv"
 :add-first-row t)
#+end_src
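The monthly filter above packs several rules into one expression. As a rough
sketch, this is how I read them, written as a plain Python predicate. The
function name ~keep_monthly~ and the sample rows are made up for illustration
(they are not real listings). The thresholds and search words are the ones used
in the Lisp block.
#+begin_src python :results output
def keep_monthly(row: dict) -> bool:
    """Mirror of the monthly filter: True when a listing should be kept."""
    address = row["display-address"].lower()  # case-insensitive matching
    return (
        row["frequency"] == "monthly"
        # Above 275 drops the car park prices; below 1500 keeps it in budget.
        and 275 < row["price"] < 1500
        # The student flag as it was written to the CSV file.
        and row["students"] == "NIL"
        and not any(word in address for word in ("car ", "park ", "parking "))
    )


# Made-up rows (NOT real listings) to exercise each rule.
rows = [
    {"frequency": "monthly", "price": 1200, "students": "NIL",
     "display-address": "Example Road, Manchester"},
    {"frequency": "monthly", "price": 10500, "students": "NIL",
     "display-address": "Penthouse, Manchester"},
    {"frequency": "monthly", "price": 300, "students": "NIL",
     "display-address": "Car Park Space, Manchester"},
    {"frequency": "weekly", "price": 348, "students": "NIL",
     "display-address": "Sample Street, Manchester"},
]
kept = [r for r in rows if keep_monthly(r)]
print(len(kept), "of", len(rows), "rows kept")
#+end_src
Only the first row survives: the second is over budget, the third matches the
car park search and the fourth is a weekly listing.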
I've stored the /cleaned/ results in separate CSV files because I will need to
work through them away from here – in terms of reviewing the listings on the
Right Move website. Having said that, I am going to do a little bit of /summary/
work so I can see what I need to aim for salary-wise if I’m to get a new job.
#+begin_src lisp :session :results silent
(lisp-stat:defdf *rm-manc-filt-df*
  (lisp-stat:read-csv #P"working-data/2024-02-18_21-48-46_right-move-manchester-monthly.csv"))
#+end_src
#+begin_src lisp :session :results drawer
(lisp-stat:summarize-column '*rm-manc-filt-df*:price)
#+end_src
#+RESULTS:
:results:
121 reals, min=945, q25=1098.75, q50=1246.4286, q75=1358.9375, max=1495
:end:
- Monthly Average: £1236.12 (done in CSV file outside of this file)
The /weekly/ version of the Right Move CSV file only has six entries. I'm not
going to bother using Lisp-Stat here. I'm just going to summarise manually.
| Weekly (£) | Monthly (£) | Notes |
|------------+-------------+---------------|
| 348 | 1392 | |
| 238 | 952 | |
| 592 | 2368 | Out of budget |
| 458 | 1832 | Out of budget |
| 458 | 1832 | Out of budget |
| 443 | 1772 | Out of budget |
#+TBLFM: @2$2=4*$1::@3$2=4*$1::@4$2=4*$1::@5$2=4*$1::@6$2=4*$1::@7$2=4*$1
Looks like I'm keeping only two of the weekly entries. Deleted the 'Out of
budget' entries from the
=working-data/2024-02-18_21-48-46_right-move-manchester-weekly.csv= file.
Combining the monthly and weekly averages to get the average rent price for places
to live in Manchester, via Right Move,
#+begin_src lisp :results output raw
(let* ((weekly-avg (/ (+ 1392 952) 2))
       (wk-mnt-avg (/ (+ 1236 weekly-avg) 2)))
  ;; The leading "- " makes the raw output an Org list, matching the results below.
  (format t "- Weekly rent average with Right Move (converted to monthly): £~a~%" weekly-avg)
  (format t "- OVERALL monthly rent for Manchester with Right Move: £~a~%" wk-mnt-avg))
#+end_src
- Weekly rent average with Right Move (converted to monthly): £1172
- OVERALL monthly rent for Manchester with Right Move: £1204
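The arithmetic behind those two figures, as a small Python sketch. The numbers
are the ones from the table and the Lisp block above. The =weighted= line is an
extra which is not part of the original working: it assumes each average should
count in proportion to its number of listings (121 monthly, 2 weekly).
#+begin_src python :results output
WEEKS_PER_MONTH = 4          # same conversion as the table formula (4 * weekly)

weekly_kept = [348, 238]     # the two in-budget weekly rents from the table
monthly_equiv = [w * WEEKS_PER_MONTH for w in weekly_kept]   # [1392, 952]
weekly_avg = sum(monthly_equiv) / len(monthly_equiv)         # 1172.0

monthly_avg = 1236           # rounded monthly average, as used in the Lisp block
overall = (monthly_avg + weekly_avg) / 2                     # 1204.0

# Assumption (not in the original): weight each average by its number
# of listings, 121 monthly (from the summary above) and 2 weekly.
weighted = (121 * 1236.12 + 2 * weekly_avg) / (121 + 2)

print(f"Weekly average (as monthly): £{weekly_avg:.0f}")
print(f"Mean of the two averages:    £{overall:.0f}")
print(f"Listing-weighted average:    £{weighted:.2f}")
#+end_src
The £1204 figure is the plain mean of the two averages, so the two weekly
listings pull it down more than they would in a listing-weighted average. The
rest of this file keeps working with £1204.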
So, to sum up the properties on Right Move,
| Source | Min. (£) | Max. (£) | Avg. (£) |
|------------+----------+----------+----------|
| Right Move | 945 | 1495 | 1204 |
I formatted the =Annual= values manually. Kept the table formulas for reference.
| Quantity | Annual  |
|----------+---------|
| Minimum  | £11,340 |
| Maximum  | £17,940 |
| Average  | £14,448 |
#+TBLFM: @2$2=945*12::@3$2=1495*12::@4$2=1204*12
* Summary of Right Move Data
I basically need to earn around £15,000/yr just to cover my rent. This
does not include food, clothes, travel and socialising. Realistically, I'm
looking at aiming for £20,000/yr if I go this route.
To get the bar chart created below, I manually added a column called ~ROW_ID~ and
assigned consecutive integers to it in the
=working-data/2024-02-18_21-48-46_right-move-manchester-monthly.csv= file. The
chart doesn't include the /weekly/ entries. Only two weekly entries
apply here and they don't affect the average in a significant way. Including
the weekly rates, the average goes from £1236.12 to £1204 (difference of
£32/month).
#+begin_src lisp :session :results file
(vega:defplot monthly-rent
  `(:title "Advertised Monthly Rent Rates for Manchester (18/02/2024) from Right Move (Filtered High & Low Entries)"
    :description "Some text"
    :data ,*rm-manc-filt-df*
    :width 1600
    :height 900
    :layer #((:mark (:type :bar :fill "steelblue")
              :encoding (:x (:field :ROW-ID :title "Assigned Id.")
                         :y (:field :PRICE :title "Rent (£)" :type :quantitative)
                         :legend (:direction :vertical)
                         :tooltip (:field :display-address)))
             (:mark (:type :rule :color "darkorange" :size 3)
              :encoding (:y (:field :PRICE :type :quantitative :aggregate :average)
                         :tooltip (:field :PRICE :type :quantitative :aggregate :average))))))
(vega:write-html monthly-rent #P"renders/2024-02-18_21-48-46_right-move-manchester-monthly.html")
#+end_src
#+RESULTS:
[[file:renders/2024-02-18_21-48-46_right-move-manchester-monthly.html]]
#+attr_latex: :width 600
#+attr_html: :width 800
[[file:renders/2024-02-18_21-48-46_right-move-manchester-monthly.png]]
* Add Tax and Daily Rates Breakdown
This is the first data-set I worked on, and since then I've written more code to
help parse and summarise the data. So, I'm going to add in here what I've been
doing in the other Manchester files.
#+begin_src calc :results output
14448 + 5000
#+end_src
#+RESULTS:
: 19448
Using £19,448 as the starting point (see [[file:./uk-wage-tax.org][UK Wage and Tax Rates]]),
#+begin_src lisp :results output raw
(let* ((earning-target 19448)
       (p-allow 12570)
       (taxable-income (- earning-target p-allow))
       (tax-to-pay (* taxable-income 0.2))
       (total (- earning-target tax-to-pay)))
  (format t "- Annual Target Salary: £~a~%" earning-target)
  (format t "- Part of Salary which is Taxable: £~a~%" taxable-income)
  (format t "- Tax to Pay: £~a~%" tax-to-pay)
  (format t "- Salary After Tax: £~a~%" total))
#+end_src
#+RESULTS:
- Annual Target Salary: £19448
- Part of Salary which is Taxable: £6878
- Tax to Pay: £1375.6
- Salary After Tax: £18072.4
| Time Span | Value After Tax (£) | Mean Rent (£) |
|-----------------------+---------------------+---------------|
| Annually | 18072.4 | 1204 |
| Monthly (Before Rent) | 1506.0333 | |
| Monthly (After Rent) | 302.0333 | |
| Weekly (After Rent) | 75.508325 | |
| Daily (After Rent) | 10.786904 | |
#+TBLFM: @3$2=@-1/12::@4$2=@-1-@-2$+1::@5$2=@-1/4::@6$2=@-1/7
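The table formulas are a bit cryptic, so here is the same chain of steps as a
short Python sketch. The personal allowance, tax rate and mean rent are the
figures used above. The function name ~daily_budget~ is made up, and the
~max(..., 0)~ guard for salaries below the allowance is my own addition. Like
the working above, it only covers Income Tax at a single 20% rate.
#+begin_src python :results output
PERSONAL_ALLOWANCE = 12570   # figure used in the Lisp block above
BASIC_RATE = 0.2             # 20% Income Tax, as in the Lisp block above
MEAN_RENT = 1204             # monthly, from the Right Move summary


def daily_budget(salary: float) -> dict:
    """Reproduce the table: after-tax salary down to a daily figure."""
    # Guard (an addition): no tax is due below the personal allowance.
    tax = max(salary - PERSONAL_ALLOWANCE, 0) * BASIC_RATE
    annual = salary - tax
    monthly = annual / 12
    after_rent = monthly - MEAN_RENT
    weekly = after_rent / 4      # the table treats a month as 4 weeks
    daily = weekly / 7
    return {"annual": annual, "monthly": monthly,
            "after_rent": after_rent, "weekly": weekly, "daily": daily}


for label, value in daily_budget(19448).items():
    print(f"{label:>10}: £{value:.2f}")
#+end_src
Changing the argument to ~daily_budget~ is a quick way to see how a different
target salary changes the daily figure.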
After paying rent and Income Tax, it looks like I'll be left with £10.79/day to
spend. Having had a quick look at the
=raw-data/2024-02-18_21-48-46_right-move-manchester.json= file and done a 'bills
inc' search, it looks like most, if not all, the listings include bills. This is
a bit of a relief, but still depressing reading.