Data analysis of the light meter readings taken with the Light Meter project. The main area of study is the health and safety concern regarding epilepsy.
This repo is archived. You can view files and clone it, but cannot push or open issues/pull-requests.


Squashed commit of the following:

commit 3c15bf541ea21f5ab712bc6a871e99e8abe75304
Author: Craig Oates <craig@craigoates.net>
Date: Sun May 9 00:26:07 2021 +0100
change save_filtered_flickers to use writerow. It was using writerows before this but writerow makes it easier to change. This is probably a nothing change but hey-ho!

commit c73f5b2d2542d7bfa332ab70d6323e79f93cae2f
Author: Craig Oates <craig@craigoates.net>
Date: Sun May 9 00:19:07 2021 +0100
add fix to stop duplicating entries in find_readings_with_lights_on. The function was originally looping through the set of readings for a particular time-stamp. If there was more than one reading above 39, it would append that particular time-stamp every time the if-statement was true -- as it looped through each reading for said time-stamp. This change adds a break and a variable to track whether the time-stamp should be added to the list -- after it has broken out of the loop.

commit f35fdd611a182177e3c415fd86e83cab553e147c
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 21:27:54 2021 +0100
add comments to flicker.py explaining process. These comments explain how each 'section' works with the data. They're mostly here for when I come back to this months/years from now and have forgotten how this code works. The other scenario this is for -- although very unlikely -- is other people who are new to the project and need a helping hand.

commit 2e602e9082ffd60a86bf43eb0146d7715e02cf2d
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 21:01:25 2021 +0100
filter readings (over Hertz and light-level thresholds). Having filtered the list down to readings which surpass the Hertz threshold (4+ per-second at time of writing), the code here filters it down even more. This bit of code searches within this already-filtered list for any readings which activate the light in the gallery (with the threshold matching that of 'gallery1', which is anything over 39). It then proceeds to save the results.

commit 80cf014c21ea3c5393aca10cbfa785e398d11daf
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 20:56:40 2021 +0100
add save function for filtered flickers list. This could do with being generalised with the other save functions (in io_services). For now, its job is to save the list of readings which contain four or more readings-per-second and at least one of them is over 39. The specific value of 39 is because the test data used is from 'factory1' (Light Meter) and that is the threshold for triggering the lights connected to 'gallery1' in the gallery.

commit 7bfd06cf0380d0f26ff4cc9ccd4607009b0e353f
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 19:27:03 2021 +0100
implement find_flicker feature. This function goes through the list of readings and forms a dictionary of time-stamps with light readings beyond a specified readings-per-second threshold, along with the readings for each of those time-stamps. The results are then saved to the specified file, using the 'save_rps_totals' function.

commit 2cbecc0d2c991a75c8735662056a0be540de2e25
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 17:25:06 2021 +0100
add save function for time-stamps with readings above threshold. This is not exactly encoded into the code itself. It's implied and needs to be used with that in mind. It's more of a 'save list' function. I will probably rename/refactor this in the future depending on how the project develops.
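The de-duplication fix described in commit c73f5b2 could look something like the sketch below. This is only an illustration pieced together from the commit messages; the argument shape ({time-stamp: [readings]}) and the default threshold of 39 are assumptions, not the repository's actual code.

def find_readings_with_lights_on(flicker_entries, light_threshold=39):
    # flicker_entries is assumed to be a dictionary of
    # {time_stamp: [reading, reading, ...]} built by find_flickers.
    results = []
    for time_stamp, readings in flicker_entries.items():
        lights_on = False
        for reading in readings:
            if reading > light_threshold:
                lights_on = True
                break  # Stop here so the time-stamp is only recorded once.
        if lights_on:
            results.append((time_stamp, readings))
    return results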
commit 263bfe40fd8786e20fa9e1b0d86638f7b4e25317
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 17:22:44 2021 +0100
move 'data service' files to data_services file. Part of a move to clean up flicker.py -- making it a place to call the functions in a way which makes it easy to 'switch' functions off via comments.

commit ee0853b3dd2109a6cc74705d869d9b8f89115e3b
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 16:37:50 2021 +0100
create data_services.py. Part of the move to clean up flicker.py and make the code more modular (for a potential improvement in REPL usage?).

commit 0f6fdc22facb64a9005b7b5eaf879ed891d860ef
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 16:35:50 2021 +0100
add readings-per-sec.csv file to gitignore. It was getting in the way and doesn't need to be part of the commit history. Its only purpose is to hold the output results -- to be used elsewhere.

commit d9742dc43b22227c0db7b69f5a8eb5167a7c4d19
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 16:30:54 2021 +0100
remove redundant (save) code.

commit 9fb282db1fa3e783176a8ef2e753d14bd5f2e150
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 16:30:07 2021 +0100
move save_rps_totals function to io_services. Part of code clean-up.

commit 26b01f38f513458169c765e0399a58f819762262
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 16:16:43 2021 +0100
tally readings per second using list of tuples. This replaces the original way of doing it with a dictionary. The change was brought about because the previous data-loading function was omitting duplicate entries (I.E. multiple readings from the same second in time).

commit a4f6142137efb959851f6a783bef3ad779ca28cd
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 16:15:49 2021 +0100
fix string interpolation bug in print_list function.

commit b149ce3a8d393ee6f88c83d123e90a7ea7bedb9e
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 16:02:14 2021 +0100
change return type to list in load_raw_data. I made the mistake of not realising the dictionary was 'removing' duplicate entries by simply not adding the 'second' reading for a given time interval. An example of this is as follows: there are two readings within the fifty-sixth second (E.G. 07:03:56) but the second one was being omitted from the dictionary storing the data after it was loaded into memory. I changed the return type to a list of tuples to preserve the raw part of the data (I.E. multiple readings per second). The intention here is so I can start from the 'raw' data without needing to load the data in numerous times during run-time. I've omitted the 'Id' column because I have no need for it in this context. If I do need it, though, I can add an extra item to the returned tuple (I.E. add r[0] to append). This bug came about because I took most of the code from the initial 'load data' function. The original function converted the raw CSV data into a dictionary which tallied the total readings per second before returning it. This function doesn't do that. It leaves the data in a more raw state.

commit 8f9df9462b70b0fde1fac4edf6e812409929841d
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 15:13:22 2021 +0100
import io_services into flicker.py. I also removed the load_data function. This is part of the gradual move to transfer 'service' based functions out of flicker.py. The aim here is to reduce the need for duplicating code and make it easier to make function calls when needing a particular piece of data (or transformation of data).
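Going by the b149ce3 commit message, load_raw_data presumably reads the exported CSV into a list of (time-stamp, reading) tuples so that multiple readings within the same second are kept. A minimal sketch along those lines follows; the column order (Id, time-stamp, reading) and the presence of a header row are assumptions.

import csv

def load_raw_data(file_path):
    # Keep the data in a 'raw' list of tuples so multiple readings
    # within the same second are preserved (a dictionary keyed on the
    # time-stamp would silently drop them).
    readings = []
    with open(file_path, newline="") as csv_file:
        reader = csv.reader(csv_file)
        next(reader)  # Skip the assumed header row.
        for r in reader:
            # r[0] is assumed to be the 'Id' column, which is not needed here.
            readings.append((r[1], r[2]))
    return readings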
commit 18d0470d433b9d6256c9c683711e5c723923517f
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 15:11:32 2021 +0100
create log_services.py. Houses a collection of print-based functions to help relay information in the terminal. The biggest motivation for this was to make the dictionaries easier to read when printing them in the terminal.

commit 47636b3b18400bf2f4ef2c44f4549534af50eb17
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 15:02:07 2021 +0100
create io_services.py. The initial services file. This provides a function to load the CSV file of the raw test data (factory1 Light Meter).

commit 25277572c008d35663e4e88474cd95295005b3fe
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 15:00:40 2021 +0100
begin moving code to 'services'. I've begun getting into a mess with trying to use duplicated code and data. I've begun to move functions to their own services folder and files to help reduce the duplication.

commit b93a207ba3e3511c42a3daae6928e4196dc86f56
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 13:13:40 2021 +0100
rename reading-per-sec-tallies.csv to readings-per-sec.csv. Did this for ease of typing.

commit 7c9feab1b247220a0fe4d55ff0fba6d79b89e5af
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 13:11:52 2021 +0100
put code into functions and 'main' in flicker.py. This is just so I can start to 'turn things on and off' (via comments).

commit 51f0d3e4bf5a0b2139caddccc8f0aa1b73a577a6
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 12:52:41 2021 +0100
update file names to new test-data file names. This follows on from a previous commit which shortened the names of the files housing the test data.

commit f0cbce52f612230607a63c877d25a9e1fe81467c
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 12:48:40 2021 +0100
rename test data files. Because I'm only testing and getting a proof-of-concept up and running, the longer and more specific file names are not worth the effort right now. They are unwieldy to type and getting in the way. I've renamed them to shorter names with the intention of using more specific file names when things start to settle in the direction of the project.

commit 9b87d742df13618886834c0ffbbe57ff616f495f
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 00:18:10 2021 +0100
create 23-04-2021-readings-per-sec.csv ('results' file). This file houses the data flicker.py computes/generates.

commit e04617ae137cee6320e8e64e7e2c9d1e3b344e0b
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 00:13:45 2021 +0100
delete original flicker.py (moved to src). Just basic organisation. I accidentally created flicker.py at the repository's root. I meant to create it in 'src'. I'd already pushed the commit before I spotted my mistake, hence the 'mess' in the git history.

commit dc05b3481d47e506bb02d61e8e9499c15f490139
Author: Craig Oates <craig@craigoates.net>
Date: Sat May 8 00:01:37 2021 +0100
create flicker.py and csv parsing code. The code in this file is mostly to get the ball moving in this repository. This file opens the 'light-meter-sample-readings-23-04-2021-ritherdon.csv' file and tallies up the number of readings in each requests-per-second group. After that, it writes the results to 23-04-2021-readings-per-sec.csv. As an example (to help explain), it counts and stores how many times the light meter (factory1) took a reading at the rate of two times per second. In this instance, there were 2955 instances of the light meter taking two readings per second. This roughly equates to 10% of the day's readings being at a rate of two requests per second. 28% of the time the light meter was recording at four requests per second, 18% at one and 44% at three.

commit 77ee5318eca7633d424a0d4debc49120a38db149
Author: Craig Oates <craig@craigoates.net>
Date: Fri May 7 23:59:10 2021 +0100
create 'lite' sample data set. The main file I was using was too big and taking too long to process. Whilst I got things moving in this repository, I removed a lot of the data to help speed up the development process. This should be a temporary file and I expect it to be deleted at some point in the future.

commit aeaf2dc258ead773802bc1fe9fec5f1fe3a46d74
Author: Craig Oates <craig@craigoates.net>
Date: Fri May 7 20:27:02 2021 +0100
create flicker.py. This is the 'main' file, if you will.

commit 734097fcc9f758fb363fd2885beb6a8e850e9c78
Author: Craig Oates <craig@craigoates.net>
Date: Fri May 7 20:17:24 2021 +0100
import initial light meter readings 23-04-2021. This is a .csv file and will be used as the initial test data this project will aim to break down into a list of readings for each second. At the moment, I'm being vague with my description of what 'breakdown' means here because this is just a proof-of-concept. I don't expect this file to remain around for too long in the repository.

commit 00783700937bd4f53cd90fdd70b61528e943dbd1
Author: Craig Oates <craig@craigoates.net>
Date: Fri May 7 20:16:01 2021 +0100
create requirements.txt.
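The tallying and percentage breakdown described in the dc05b34 commit (2955 one-second groups with two readings, roughly 10%, 28%, 18% and 44% across the four rates) could be reproduced with something like the following sketch. It treats each one-second group as one instance and assumes the (time-stamp, reading) tuples returned by the loading code; it is not the code from that commit.

from collections import Counter

def summarise_reading_rates(readings):
    # readings is assumed to be a list of (time_stamp, reading) tuples.
    # Count how many readings landed in each one-second group...
    per_second = Counter(time_stamp for time_stamp, _ in readings)
    # ...then count how many groups had 1, 2, 3, 4, ... readings.
    rate_totals = Counter(per_second.values())
    total_groups = sum(rate_totals.values())
    for rate, count in sorted(rate_totals.items()):
        print(f"{rate} reading(s) per second: {count} ({count / total_groups:.0%})")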
from services import io_services, log_services, data_services


def main():
    # File paths used to save and load the data.
    raw_data_path = "data/test-data.csv"
    rps_save_path = "data/results/readings-per-sec.csv"
    rps_above_thresh = "data/results/readings_above_threshold.csv"
    flicker_list = "data/results/flicker_list.csv"
    filtered_flickers = "data/results/filtered_flicker_entries.csv"

    # Step 1
    # ======
    # Load the raw data, taken from the exported database.
    raw_data = io_services.load_raw_data(raw_data_path)
    # log_services.print_list(raw_data)

    # Step 2
    # ======
    # Tally-up how many readings occurred for each second in the
    # raw data. For example, for the period between 2021-04-23
    # 07:03:57 and 2021-04-23 07:03:58, how many readings did the
    # system take? Was it 1, 3, etc.?
    time_tallies = data_services.tally_readings_per_second(raw_data)
    # log_services.print_dictionary(time_tallies)

    # Step 3
    # ======
    # Count the number of tallies derived from step 2. So, how many
    # times did the system take 2 readings-per-second, how many
    # times did it take 3 readings-per-second, etc.
    rps_totals = data_services.total_count_for_each_reading_per_second(time_tallies)
    # log_services.print_dictionary(rps_totals)
    io_services.save_rps_totals(rps_totals, rps_save_path)

    # Step 4
    # ======
    # List out all the time periods which had more than the
    # specified readings-per-second. The default value is any time
    # period with more than 4 readings-per-second but you can change
    # it (it's a function argument).
    rps_above_hertz = data_services.get_rps_above(4, time_tallies)
    # log_services.print_list(rps_above_hertz)
    io_services.save_rps_above_threshold(rps_above_hertz, rps_above_thresh)

    # Step 5
    # ======
    # Using the list of time periods derived in step 4
    # (rps_above_hertz), create a new list from the raw data (step
    # 1). Filter the raw data down to the time periods found in step
    # 4 and include all the readings recorded in each of those time
    # periods (I.E. seconds). For example: the list created in step 4
    # shows the time period between 2021-04-23 07:03:57 and
    # 2021-04-23 07:03:58 has 4 readings, so note the (start) time
    # and the readings recorded in that period.
    flicker_entries = data_services.find_flickers(rps_above_hertz, raw_data)
    # log_services.print_dictionary(flicker_entries)
    io_services.save_rps_totals(flicker_entries, flicker_list)

    # Step 6
    # ======
    # This step filters the time periods which pass the hertz
    # threshold down to those with at least one reading in their
    # (1 second) grouping high enough to turn the light on in the
    # gallery ('gallery1'). If this list contains any readings, you
    # can review them to make sure the light doesn't turn on and off
    # enough times to potentially cause a photosensitive epileptic
    # seizure.
    filtered_flicker_entries = data_services.find_readings_with_lights_on(flicker_entries)
    # log_services.print_list(filtered_flicker_entries)
    io_services.save_filtered_flickers(filtered_flicker_entries, filtered_flickers)


if __name__ == "__main__":
    main()
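The data_services module itself is not shown on this page. As a rough guide to what steps 4 and 5 above expect from it, the two calls could be satisfied by something like the sketch below; the signatures and data shapes are assumptions inferred from the step comments and the commit messages, not the module's actual contents.

def get_rps_above(threshold, time_tallies):
    # time_tallies is assumed to be {time_stamp: readings_per_second}.
    # Keep the time-stamps whose tally reaches the threshold ('4+' in
    # the commit messages).
    return [time_stamp
            for time_stamp, tally in time_tallies.items()
            if tally >= threshold]

def find_flickers(time_stamps, raw_data):
    # Group every raw (time_stamp, reading) tuple whose time-stamp
    # passed the readings-per-second threshold.
    wanted = set(time_stamps)
    flickers = {}
    for time_stamp, reading in raw_data:
        if time_stamp in wanted:
            flickers.setdefault(time_stamp, []).append(reading)
    return flickers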