This replaces the original dictionary-based approach. The change was
needed because the previous data loading function was omitting
duplicate entries (i.e. multiple readings from the same second in
time).
I made the mistake of not realising the dictionary was 'removing'
duplicate entries by simply not adding the second reading for a
given second. For example, there are two readings within the
fifty-sixth second (e.g. 07:03:56), but the second reading was being
omitted from the dictionary storing the data after it was loaded
into memory.
I changed the return type to a list of tuples to preserve the raw part
of the data (i.e. multiple readings per second). The intention here is
to start from the 'raw' data without needing to load the data
numerous times at run-time. I've omitted the 'Id' column because I
have no need for it in this context. If I do need it, though, I can
add an extra item to the returned tuple (i.e. add r[0] to the append
call).
This bug came about because I took most of the code from the initial
'load data' function. The original function converted the raw CSV data
into a dictionary which tallied the total readings per second before
returning it. This function doesn't do that. It leaves the data in a
more raw state.
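A minimal sketch of the difference between the two return types. The
column layout (Id, time, reading), the sample rows and the function
names load_as_dict and load_as_tuples are assumptions made for
illustration; they are not taken from the real CSV file or from
flicker.py. The use of setdefault is also an assumption, chosen only
because it reproduces the 'second reading is not added' behaviour
described above.

```python
import csv
import io

# Hypothetical sample data: the real file's columns may differ.
SAMPLE = """Id,Time,Reading
1,07:03:55,101
2,07:03:56,102
3,07:03:56,103
"""


def load_as_dict(handle):
    """Old behaviour (sketch): one entry per second, duplicates lost."""
    readings = {}
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    for r in reader:
        # setdefault keeps the first value for a key and ignores later
        # ones, so the second reading at 07:03:56 never gets stored.
        readings.setdefault(r[1], r[2])
    return readings


def load_as_tuples(handle):
    """New behaviour (sketch): every row kept, 'Id' column omitted."""
    readings = []
    reader = csv.reader(handle)
    next(reader)  # skip the header row
    for r in reader:
        # Add r[0] to this tuple if the 'Id' column is ever needed.
        readings.append((r[1], r[2]))
    return readings


as_dict = load_as_dict(io.StringIO(SAMPLE))
as_tuples = load_as_tuples(io.StringIO(SAMPLE))
print(len(as_dict), "entries in the dictionary")
print(len(as_tuples), "entries in the list of tuples")
```

With this sample, the dictionary ends up with two entries and the list
with three, because both readings at 07:03:56 survive in the list.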
I also removed the load_data function. This is part of the gradual
move to transfer 'service' based functions out of flicker.py. The aim
here is to reduce the need for duplicating code and to make it easier
to call a function when I need a particular piece of data (or
transformation of data).
Houses a collection of print-based functions to help relay information
in the terminal. The biggest motivation for this was to make the
dictionaries easier to read when printing them in the terminal.
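As a rough idea of what such a helper could look like, here is a
sketch that prints a dictionary one key per line with the keys
aligned. The names format_dict and print_dict, and the layout they
produce, are hypothetical and are not the functions in this
repository.

```python
def format_dict(data, indent=2):
    """Return the dictionary as text, one 'key : value' pair per line."""
    pad = " " * indent
    # Pad every key to the width of the longest key so values line up.
    width = max((len(str(k)) for k in data), default=0)
    lines = [f"{pad}{str(k):<{width}} : {v}" for k, v in data.items()]
    return "\n".join(lines)


def print_dict(data, title=None):
    """Print an optional title followed by the formatted dictionary."""
    if title is not None:
        print(title)
    print(format_dict(data))


print_dict({"07:03:55": 1, "07:03:56": 2}, title="Readings per second")
```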
I've begun getting into a mess with trying to use duplicated code and
data. I've begun to move functions to their own services folder and
files to help reduce the duplication.
Because I'm only testing and getting a proof-of-concept up and
running, the longer and more specific file names are not worth the
effort right now. They are unwieldy to type and are getting in the
way. I've renamed the files to shorter names with the intention of
using more specific file names when the direction of the project
starts to settle.
Just basic organisation. I accidentally created flicker.py at the
repository's root. I meant to create it in 'src'. I'd already pushed
the commit before I spotted my mistake, hence the 'mess' in the git
history.
The code in this file is mostly to get the ball rolling in this
repository. This file opens the
'light-meter-sample-readings-23-04-2021-ritherdon.csv' file and
tallies up how many seconds fall into each requests-per-second
group. After that, it writes the results to
23-04-2021-readings-per-sec.csv.
As an example (to help explain), it counts and stores how many times
the light meter (factory1) took readings at a rate of two per
second. In this instance, there were 2955 instances of the light meter
taking two readings per second. This roughly equates to 10% of the
day's readings being taken at a rate of two requests per second. The
light meter was recording at four requests per second 28% of the
time, at one request per second 18% of the time and at three requests
per second 44% of the time.
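The tallying described above can be sketched with two passes of
collections.Counter. The sample readings and the names per_second,
rate_groups and shares are made up for illustration, and the result
is written to an in-memory buffer rather than to the real output
file. I'm also assuming each percentage is a share of the seconds
recorded, which may not be exactly how the figures above were
worked out.

```python
import csv
import io
from collections import Counter

# Hypothetical (time, reading) tuples; not the real light meter data.
readings = [
    ("07:03:55", "101"),
    ("07:03:56", "102"), ("07:03:56", "103"),
    ("07:03:57", "104"), ("07:03:57", "105"),
    ("07:03:58", "106"), ("07:03:58", "107"), ("07:03:58", "108"),
]

# Pass 1: how many readings were taken in each second.
per_second = Counter(time for time, _ in readings)

# Pass 2: how many seconds had one reading, two readings, and so on.
rate_groups = Counter(per_second.values())

# Each group's share of all the seconds seen, as a percentage.
total_seconds = sum(rate_groups.values())
shares = {rate: 100 * count / total_seconds
          for rate, count in rate_groups.items()}

# Write the tallies out as CSV (to a buffer here, not to disk).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["readings_per_second", "seconds", "percent"])
for rate in sorted(rate_groups):
    writer.writerow([rate, rate_groups[rate], shares[rate]])

print(buffer.getvalue())
```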
The main file I was using was too big and was taking too long to
process. Whilst I get things moving in this repository, I've removed
a lot of the data to help speed up the development process. This
should be a temporary file and I expect it to be deleted at some
point in the future.
This is a .csv file and will be used as the initial test data, which
this project will aim to break down into a list of readings for each
second. At the moment, I'm being vague with my description of what
'break down' means here because this is just a proof-of-concept. I
don't expect this file to remain in the repository for too long.