#+title: Overhaul 2024 Journal
#+options: ':nil *:t -:t ::t <:t H:3 \n:nil ^:t arch:headline author:t
#+options: broken-links:nil c:nil creator:nil d:(not "LOGBOOK") date:t e:t
#+options: email:nil expand-links:t f:t inline:t num:t p:nil pri:nil prop:nil
#+options: stat:t tags:t tasks:t tex:t timestamp:t title:t toc:t todo:t |:t
#+date: \today
#+author: Craig Oates
#+email: craig@craigoates.net
#+language: en
#+select_tags: export
#+exclude_tags: noexport
#+creator: Emacs 29.1.90 (Org mode 9.7-pre)
#+cite_export:

* Setup Directories

This just sets up the work environment's directories. You should only need to
run this the first time you work your way through this file.

#+begin_src shell :results output
mkdir raw-data renders working-data
#+end_src

* Setup Python Virtual Environment (Properties Drawer)
:PROPERTIES:
:header-args:shell: :exports code
:header-args:elisp: :exports code
:header-args:ipython: :exports both
:header-args: :eval never-export
:END:

I've got some code [[https://git.abbether.net/craig.oates/literate-charting/src/branch/master/charting.org][here]] which has information about setting up Python to be
used in a literate programming context with org-mode.

#+begin_src shell
python3 -m venv venv
# To activate the virtual environment in your CLI...
source venv/bin/activate
#+end_src

#+begin_src shell
# Should only need this on first run.
touch requirements.txt
#+end_src

#+begin_src shell
# To activate the virtual environment in your CLI...
source venv/bin/activate
#+end_src

#+begin_src shell :results code
# To check results.
tree -L 1
#+end_src

#+begin_src elisp :results silent
;; Activate at start of coding session...
;; Change path to where you have saved this repository.
(pyvenv-activate "~/dev-shed/overhaul2024/venv")
(pyvenv-mode)
#+end_src

You can, also, do this via =M-x pyvenv-activate=. You should be able to enter
~venv~ at the prompt and be good to go. Use =M-x pyvenv-mode= to see the
currently active virtual-environment in the mode-line.
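
To confirm the activation took effect from inside a Python session, a quick
check like the one below can help. It is a minimal sketch and not part of the
original setup: the function name ~in_virtualenv~ is made up for illustration.
It relies on the standard-library behaviour that, inside a venv, ~sys.prefix~
differs from ~sys.base_prefix~.

#+begin_src python :results output
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


print("venv active:", in_virtualenv())
print("interpreter prefix:", sys.prefix)
#+end_src

If the first line reports =False=, the session is probably still using the
system interpreter and the activation step needs repeating.
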
#+begin_src shell :results silent
# Snippets to install and store Python packages (in venv).
# You'll probably be better running these at the CLI.
pip freeze > requirements.txt
pip install -r requirements.txt
#+end_src

* Scrape Right Move Data Manchester (2024-02-18 Sun)

Found some code at [[https://scrapfly.io/blog/how-to-scrape-rightmove/#full-scraper-code][ScrapFly]] for scraping data from [[https://www.rightmove.co.uk][Right Move]]. I put it back
together (it was in code blocks for the purpose of the blog post) and modified
it to what I need. Ideally, I would have the code within this file but the code
is already written and it is easier to just run the code as a script file (from
here).

*You will need to change the ~location~ variable in the ~run~ function to search
and scrape data from other parts of the country.*

The data is stored in =raw-data/=, as a JSON file, and will require further
parsing and processing.

#+begin_src shell :results silent
python rightmove.py
#+end_src

* Setup Common Lisp Environment

*Run ~M-x slime~ before running the following code.* And, make note of the
~:session~ attribute. It allows the code in this code block to be used in other
code blocks which also use the ~:session~ attribute.

#+begin_src lisp :session :results silent
(ql:quickload :com.inuoe.jzon)
(ql:quickload :plot/vega)
(ql:quickload :lisp-stat)
(ql:quickload :data-frame)
(ql:quickload :str)
#+end_src

* Convert Right Move JSON Data (2024-02-18 Sun) for Manchester to CSV File

This code goes through the JSON file, returned by Right Move, and creates a CSV
file of the data I'm most interested in. It's, also, prep. work for using Lisp
Stat.

#+begin_src lisp :results silent :session
;; Adjust path to match the file you want to process.
(let ((data (com.inuoe.jzon:parse #P"raw-data/2024-02-18_21-48-46_right-move-manchester.json")))
  (with-open-file (stream #P"working-data/2024-02-18_21-48-46_right-move-manchester.csv"
                          :direction :output
                          :if-exists :supersede)
    (format stream "ID, Price, Frequency, Bedrooms, Bathrooms, Display Address, Students, Latitude, Longitude, URL~%")
    (loop for homes across data
          do (format stream "~a, ~a, ~a, ~a, ~a, ~s, ~a, ~a, ~a, https://www.rightmove.co.uk/properties/~a~%"
                     (gethash "id" homes)
                     (gethash "amount" (gethash "price" homes))
                     (gethash "frequency" (gethash "price" homes))
                     (gethash "bedrooms" homes)
                     (gethash "bathrooms" homes)
                     (gethash "displayAddress" homes)
                     (gethash "students" homes)
                     (gethash "latitude" (gethash "location" homes))
                     (gethash "longitude" (gethash "location" homes))
                     (gethash "id" homes)))))
#+end_src

* Explore CSV Data for Right Move Manchester (2024-02-18)

#+begin_src lisp :results silent :session
(defvar *rm-manchester*
  (lisp-stat:read-csv #P"working-data/2024-02-18_21-48-46_right-move-manchester.csv")
  "Data from the CSV file (after it has been processed from the JSON file).")
(lisp-stat:defdf *rm-manc-df* *rm-manchester*)
#+end_src

Having had a quick look at the CSV file, I noticed some rent prices above
£10,000 a month. This is just way outside my budget, so I could do with
filtering them out. There are, also, some entries which are car parks, so I'll
need to filter them out as well. I could, also, do with separating out the
weekly and monthly rent prices so I can look at them fairly. Filtering out the
student accommodation is something I should do, as well. I'm not a student
anymore, and haven't been for quite a long time.
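
The filtering rules listed above can be sketched as a single predicate. The
Python below only illustrates the logic and is not the code used in this
journal. The function name ~keep_monthly_listing~ and the sample rows are
hypothetical, the 275 and 1500 price bounds are the ones used in the monthly
filter in this section, and it assumes a non-student listing shows up in the CSV
as the text =NIL=.

#+begin_src python :results output
import csv
import io

# Hypothetical sample rows in the same column layout as the working CSV.
SAMPLE_CSV = """ID, Price, Frequency, Bedrooms, Bathrooms, Display Address, Students, Latitude, Longitude, URL
1, 1200, monthly, 2, 1, "1 Example Street, Manchester", NIL, 53.48, -2.24, https://example.invalid/1
2, 12000, monthly, 5, 4, "2 Example Road, Manchester", NIL, 53.48, -2.24, https://example.invalid/2
3, 150, monthly, 0, 0, "Car Park Space, Example Lane", NIL, 53.48, -2.24, https://example.invalid/3
4, 900, monthly, 1, 1, "4 Example Close, Manchester", T, 53.48, -2.24, https://example.invalid/4
5, 348, weekly, 1, 1, "5 Example Way, Manchester", NIL, 53.48, -2.24, https://example.invalid/5
"""

# Substrings that mark a listing as a parking space rather than a home.
CAR_PARK_WORDS = ("car ", "park ", "parking ")


def keep_monthly_listing(row, low=275, high=1500):
    """Return True for a non-student monthly listing inside the price bounds."""
    address = row["Display Address"].lower()
    return (
        row["Frequency"] == "monthly"
        and row["Students"] == "NIL"
        and low < float(row["Price"]) < high
        and not any(word in address for word in CAR_PARK_WORDS)
    )


# skipinitialspace copes with the ", " separator used in the CSV.
rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV), skipinitialspace=True))
kept = [row["ID"] for row in rows if keep_monthly_listing(row)]
print(kept)
#+end_src

With the sample rows only listing 1 survives: the others are too expensive, a
car park, student accommodation or priced weekly.
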
#+begin_src lisp :session :results silent
(lisp-stat:write-csv
 (lisp-stat:filter-rows *rm-manc-df*
                        '(and (string= "weekly" frequency)
                              (string= "NIL" students)))
 #P"working-data/2024-02-18_21-48-46_right-move-manchester-weekly.csv"
 :add-first-row t)
#+end_src

#+begin_src lisp :session :results silent
(lisp-stat:write-csv
 (lisp-stat:filter-rows *rm-manc-df*
                        '(and (string= "monthly" frequency)
                              (> 1500 price)
                              (string= "NIL" students)
                              ;; Filtering for the car park entries.
                              (< 275 price)
                              (not (str:contains? "car " display-address :ignore-case t))
                              (not (str:contains? "park " display-address :ignore-case t))
                              (not (str:contains? "parking " display-address :ignore-case t))))
 #P"working-data/2024-02-18_21-48-46_right-move-manchester-monthly.csv"
 :add-first-row t)
#+end_src

I've stored the /cleaned/ results in separate CSV files because I will need to
work through them away from here, in terms of reviewing the listings on the
Right Move website. Having said that, I am going to do a little bit of /summary/
work so I can see what I need to aim for salary-wise if I'm to get a new job.

#+begin_src lisp :session :results silent
(lisp-stat:defdf *rm-manc-filt-df*
  (lisp-stat:read-csv #P"working-data/2024-02-18_21-48-46_right-move-manchester-monthly.csv"))
#+end_src

#+begin_src lisp :session :results drawer
(lisp-stat:summarize-column '*rm-manc-filt-df*:price)
#+end_src

#+RESULTS:
:results:
121 reals, min=945, q25=1098.75, q50=1246.4286, q75=1358.9375, max=1495
:end:

- Monthly Average: £1236.12 (done in CSV file outside of this file)

The /weekly/ version of the Right Move CSV file only has six entries. I'm not
going to bother using Lisp-Stat here. I'm just going to summarise manually.

| Weekly (£) | Monthly (£) | Notes         |
|------------+-------------+---------------|
|        348 |        1392 |               |
|        238 |         952 |               |
|        592 |        2368 | Out of budget |
|        458 |        1832 | Out of budget |
|        458 |        1832 | Out of budget |
|        443 |        1772 | Out of budget |
#+TBLFM: @2$2=4*$1::@3$2=4*$1::@4$2=4*$1::@5$2=4*$1::@6$2=4*$1::@7$2=4*$1

Looks like I'm keeping only two of the weekly entries. Deleted the 'Out of
budget' entries from the
=working-data/2024-02-18_21-48-46_right-move-manchester-weekly.csv= file.

Combining the monthly and weekly averages to get the average rent price for
places to live in Manchester, via Right Move,

#+begin_src lisp :results output raw
(let* ((weekly-avg (/ (+ 1392 952) 2))
       (wk-mnt-avg (/ (+ 1236 weekly-avg) 2)))
  (format t "Weekly rent average with Right Move (converted to monthly): £~a~%" weekly-avg)
  (format t "OVERALL monthly rent for Manchester with Right Move: £~a~%" wk-mnt-avg))
#+end_src

- Weekly rent average with Right Move (converted to monthly): £1172
- OVERALL monthly rent for Manchester with Right Move: £1204

So, to sum up the properties on Right Move,

| Source     | Min. (£) | Max. (£) | Avg. (£) |
|------------+----------+----------+----------|
| Right Move |      945 |     1495 |     1204 |

I formatted the =Annual= values manually. Kept the table formulas for reference.

| Quantity Property | Annual  |
|-------------------+---------|
| Minimum           | £11,340 |
| Maximum           | £17,940 |
| Average           | £14,448 |
#+TBLFM: @2$2=945*12::@3$2=1495*12::@4$2=1204*12

* Summary of Right Move Data

I basically need to earn around £15,000/yr just to cover my living costs. This
does not include food, clothes, travel and socialising. Realistically, I'm
looking at aiming for £20,000/yr if I go this route.

To get the bar chart created below, I manually added a column called ~ROW_ID~
and assigned consecutive integers to the
=working-data/2024-02-18_21-48-46_right-move-manchester-monthly.csv= file. The
chart doesn't include the /weekly/ entries.
There are only two entries which applied here and they don't affect the average
in a significant way. Including the weekly rates, the average goes from £1236.12
to £1204 (difference of £32/month).

#+begin_src lisp :session :results file
(vega:defplot monthly-rent
  `(:title "Advertised Monthly Rent Rates for Manchester (18/02/2024) from Right Move (Filtered High & Low Entries)"
    :description "Some text"
    :data ,*rm-manc-filt-df*
    :width 1600
    :height 900
    :layer #((:mark (:type :bar :fill "steelblue")
              :encoding (:x (:field :ROW-ID :title "Assigned Id.")
                         :y (:field :PRICE :title "Rent (£)" :type :quantitative)
                         :legend (:direction :vertical)
                         :tooltip (:field :display-address)))
             (:mark (:type :rule :color "darkorange" :size 3)
              :encoding (:y (:field :PRICE :type :quantitative :aggregate :average)
                         :tooltip (:field :PRICE :type :quantitative :aggregate :average))))))
(vega:write-html monthly-rent #P"renders/2024-02-18_21-48-46_right-move-manchester-monthly.html")
#+end_src

#+RESULTS:
[[file:renders/2024-02-18_21-48-46_right-move-manchester-monthly.html]]

#+attr_latex: :width 600
#+attr_html: :width 800
[[file:renders/2024-02-18_21-48-46_right-move-manchester-monthly.png]]

* Add Tax and Daily Rates Breakdown

This is the first data-set I worked on, and since then I've written more code to
help parse and summarise the data. So, I'm going to add in here what I've been
doing in the other Manchester files.

#+begin_src calc :results output
14448 + 5000
#+end_src

#+RESULTS:
: 19448

Using £19,448 as the starting point (see [[file:./uk-wage-tax.org][UK Wage and Tax Rates]]),

#+begin_src lisp :results output raw
(let* ((earning-target 19448)
       (p-allow 12570)
       (taxable-income (- earning-target p-allow))
       (tax-to-pay (* taxable-income 0.2))
       (total (- earning-target tax-to-pay)))
  (format t "- Annual Target Salary: £~a~%" earning-target)
  (format t "- Part of Salary which is Taxable: £~a~%" taxable-income)
  (format t "- Tax to Pay: £~a~%" tax-to-pay)
  (format t "- Salary After Tax: £~a~%" total))
#+end_src

#+RESULTS:
- Annual Target Salary: £19448
- Part of Salary which is Taxable: £6878
- Tax to Pay: £1375.6
- Salary After Tax: £18072.4

| Time Span             | Value After Tax (£) | Mean Rent (£) |
|-----------------------+---------------------+---------------|
| Annually              |             18072.4 |          1204 |
| Monthly (Before Rent) |           1506.0333 |               |
| Monthly (After Rent)  |            302.0333 |               |
| Weekly (After Rent)   |           75.508325 |               |
| Daily (After Rent)    |           10.786904 |               |
#+TBLFM: @3$2=@-1/12::@4$2=@-1-@-2$+1::@5$2=@-1/4::@6$2=@-1/7

After paying rent and Income Tax, it looks like I'll be left with £10.79/day to
spend. Having had a quick look at the
=raw-data/2024-02-18_21-48-46_right-move-manchester.json= file and done a 'bills
inc' search, it looks like most, if not all, of the listings include bills. This
is a bit of a relief, but still depressing reading.
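
The breakdown in this section can be re-checked in one place with a short
script. This is a minimal sketch in Python rather than the Lisp and table
formulas used above. The function name ~daily_budget~ and the dictionary keys
are hypothetical, the 20% rate and £12,570 allowance are the values from the tax
block, and it keeps this file's convention of treating a month as four weeks.

#+begin_src python :results output
def daily_budget(salary, rent_per_month, allowance=12570, tax_rate=0.2):
    """Return the after-tax and after-rent amounts for one annual salary."""
    # Only the part of the salary above the allowance is taxed.
    taxable = max(salary - allowance, 0)
    after_tax = salary - taxable * tax_rate
    monthly = after_tax / 12
    monthly_after_rent = monthly - rent_per_month
    # Four-week month and seven-day week, matching the table formulas.
    weekly_after_rent = monthly_after_rent / 4
    daily_after_rent = weekly_after_rent / 7
    return {
        "after_tax": after_tax,
        "monthly_before_rent": monthly,
        "monthly_after_rent": monthly_after_rent,
        "weekly_after_rent": weekly_after_rent,
        "daily_after_rent": daily_after_rent,
    }


result = daily_budget(19448, 1204)
for label, value in result.items():
    print(f"{label}: £{value:.2f}")
#+end_src

For the figures used here the daily amount comes out at £10.79, which agrees
with the table. Changing the two arguments makes it easy to try the same
breakdown against a different salary target or mean rent.
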