This video goes over a list of lists of dictionaries (nested lists of dictionaries); I parse the example three different ways with Python. There is also a further nested dictionary with lists as values, which is parsed and stored using defaultdict lists (this was tricky).
Gif at end: Moos-media on pixabay.com
end screen: www.instagram.com/footage3.0/ and on pixabay
Music & Intro Pic: Special Thanks
Pixabay: instagram (subscribe gif): @imotivationitas
Welcome everybody, you're watching Mr Fugu Data Science.
Today, we're parsing a list of lists of dictionaries.
We will solve one problem.
I got this question from a recent viewer and decided to make a video to try to help them.
We need to do a few imports.
Let's import json and pandas, and from collections import defaultdict and ChainMap.
The data set we're using today comes from a previous example I did, parsing the New York Times.
This is some raw data where each value, or row if we are putting this into a data frame, is a list of dictionaries.
To get a clear picture of what these data look like, we will take the first entry and notice that in this list of dictionaries we have a few different options.
We notice that we have inside of this another list of dictionaries from one of our values.
We will need to parse inside of the media further due to nesting.
There we'll notice that there are three URLs, formats, heights, and widths within the media metadata for each entry.
We have 20 entries, okay.
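To make the nesting concrete, here is a small hypothetical entry modeled on the description above. The key names (including "media-metadata") and all values are illustrative assumptions, not the real New York Times payload.

```python
# Hypothetical entry: a list holding one dictionary, which in turn holds
# a nested list of dictionaries under "media-metadata" (assumed key name).
entry = [
    {
        "type": "image",
        "subtype": "photo",
        "caption": "Example caption",
        "copyright": "Example credit",
        "media-metadata": [
            {"url": "https://example.com/thumb.jpg", "format": "thumbnail", "height": 75, "width": 75},
            {"url": "https://example.com/medium.jpg", "format": "medium", "height": 140, "width": 210},
            {"url": "https://example.com/large.jpg", "format": "large", "height": 293, "width": 440},
        ],
    }
]

# Reaching the inner list takes two steps: first the list, then the key.
metadata = entry[0]["media-metadata"]
urls = [item["url"] for item in metadata]
print(len(metadata), urls)
```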
If we took, for instance, one specific entry and threw it into a data frame, it would look something like this.
What we would like to have is all of these in one single row.
So you would essentially like to have one list of all the URLs, a list of all the formats, the heights, and the widths for each entry.
This was from one entry, not three, okay.
Essentially, what I would like is to create this url column, where we have a list of the three URLs.
Assume that this is the second entry and these are its three URLs, putting it into one data frame with two rows.
I just cut and pasted this; it's the same exact data, just to illustrate what I'm trying to achieve later.
If you did not want it this way and wanted to expand it, you could always use pandas explode, if that was something you wanted.
Just remember, it gives you more rows.
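As a rough illustration of the trade-off, the sketch below builds a two-row frame with list-valued cells and then expands one column with pandas explode. The column names and values are made up for the example.

```python
import pandas as pd

# Hypothetical "one row per entry" layout: each cell holds a list.
wide = pd.DataFrame({
    "url": [["a1", "a2", "a3"], ["b1", "b2", "b3"]],
    "format": [["thumbnail", "medium", "large"]] * 2,
})

# explode() gives every list element its own row; other columns are repeated.
long_df = wide.explode("url").reset_index(drop=True)
print(len(wide), len(long_df))
```

Two rows become six, which is the "more rows" cost mentioned above.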
Let's get into our first coding, with ChainMap from collections.
Think of it as taking multiple dictionaries and basically condensing them into one.
There are a few useful examples showing this in the official documentation.
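A minimal sketch of that idea, using two made-up dictionaries: ChainMap presents them as a single mapping, and when a key appears in more than one dictionary, the first one listed wins.

```python
from collections import ChainMap

# Hypothetical dictionaries with one overlapping key ("caption").
first = {"type": "image", "caption": "Example"}
second = {"caption": "Ignored duplicate", "copyright": "Example credit"}

# ChainMap searches its mappings in order, so "first" takes priority.
combined = dict(ChainMap(first, second))
print(combined)
```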
Let's just call this multiple dictionary to single.
Store this as a list, and let's iterate over the range of values within the length of the data frame we would like to parse.
We take the media column, where I'm going to store this as a dictionary and call in ChainMap with all of the arguments for the data frame that I would like to parse, and then you're going to see something kind of funky.
I'm using the iterator, but then I'm also going through all rows and columns to the end, all right.
Then I'm going to call in my good old trusty list and append, a little old friend, and let's see what this looks like.
This is going to get us most of the way.
This spreads everything out from the media column.
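The on-screen code is not part of this transcript, so the following is only a sketch of the loop as described. It assumes the media column behaves like a list of lists of dictionaries; `media_column` and its contents are hypothetical stand-ins for the real data frame column.

```python
from collections import ChainMap

import pandas as pd

# Stand-in for the media column: one list of dictionaries per entry.
media_column = [
    [{"type": "image", "caption": "First", "media-metadata": [{"url": "u1"}]}],
    [],  # an entry with no media at all
    [{"type": "image", "caption": "Third", "media-metadata": [{"url": "u3"}]}],
]

multiple_dict_to_single = []
for i in range(len(media_column)):
    # Unpack every dictionary in the entry into ChainMap, then flatten to a dict.
    multiple_dict_to_single.append(dict(ChainMap(*media_column[i])))

media_df = pd.DataFrame(multiple_dict_to_single)
print(media_df.shape)
```

The empty entry becomes an empty dictionary, which shows up as a row of missing values, and the nested metadata list is still unparsed at this stage.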
Mind you, we're going to concatenate at the end and put everything together.
This is one column that we are parsing.
If you notice, we still have the metadata that we need to take care of later.
If you notice, there are also rows that are completely null that we'll take care of, and we can do this another way.
Let's do a list.
Let's just call this merge; I just found this simple little function online on Stack Overflow. So let's iterate inside of this, and then we would like to return the dictionary with everything.
Let's do our list comprehension for the items inside of our list.
Now I'll just call this function (func) with an iterator to do a list, because we're going to iterate inside of the data frame as usual for the media.
Then let's make a temp variable and iterate through each piece. Let's append everything.
Let's call that, put it into a data frame, and there we are.
We get the same exact thing doing it a different way, okay.
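The exact Stack Overflow function is not shown in the transcript. A common pattern that matches the description is a small helper that folds a list of dictionaries into one with update(); the sketch below assumes that shape, and `media_column` is again a hypothetical stand-in.

```python
def merge(dicts):
    """Fold a list of dictionaries into one; later keys overwrite earlier ones."""
    merged = {}
    for d in dicts:
        merged.update(d)
    return merged


# Stand-in for the media column: one list of dictionaries per entry.
media_column = [
    [{"type": "image", "caption": "First"}],
    [],
    [{"type": "image", "caption": "Third"}],
]

# One merged dictionary per entry; empty entries become empty dictionaries.
rows = [merge(entry) for entry in media_column]
print(rows)
```

One difference from ChainMap is worth keeping in mind: with update() the last duplicate key wins, while ChainMap keeps the first. When each entry holds a single dictionary, the two give the same result.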
If you're curious or more inclined, you could check how fast each one of these operations is versus the amount of memory it uses with a profiler, and see which is more beneficial out of these two.
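One lightweight way to start such a comparison is the standard library's timeit, which measures run time only. It is a simpler stand-in for a full profiler, not a memory measurement. The data below is synthetic, and the timings will vary by machine.

```python
import timeit
from collections import ChainMap

# Synthetic stand-in for the media column, repeated to make timing meaningful.
media_column = [[{"type": "image", "caption": "x"}], []] * 500


def with_chainmap():
    return [dict(ChainMap(*entry)) for entry in media_column]


def with_update():
    rows = []
    for entry in media_column:
        merged = {}
        for d in entry:
            merged.update(d)
        rows.append(merged)
    return rows


chain_time = timeit.timeit(with_chainmap, number=20)
update_time = timeit.timeit(with_update, number=20)
print(f"ChainMap: {chain_time:.4f}s, update: {update_time:.4f}s")
```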
Now let's do a super crazy version, I would say.
But don't get scared, it won't bite.
So let's walk through it. I need to create a default dictionary, I'm going to create an empty list, and I'm going to iterate, like we've been doing, through this specific column.
What do I want to do? I'm going to check if the length of this is greater than zero.
What's going on here? Basically, each list can be empty or filled, in which case its length will equal one, because of the lists that we're iterating through, some will have something and some won't.
So if it has something, we will then iterate inside of it.
We take our iterator to go through each individual one, going inside of a list of lists.
Because it's now going to be a dictionary, we convert it into tuples using items, which will be useful for creating our default dictionary.
I could just leave a note here so we understand that we're converting a dictionary to tuples and iterating.
We would like to append the index value as well as j, which is going to be our tuples.
You'll see why this is important in a second.
Basically, we're trying to keep the frame of reference using the index with the tuple.
If the entry is not a filled list like this, we need to figure out a way to take care of that.
I'll make a note of this in a second.
I'm going to create a tuple, and I'm going to zip together the keys from the first element of the data frame with the fill values.
What I'm trying to do here is handle anything that didn't fall into this first case.
I need to create a tuple that has the same keys that are inside of our dictionary, and the values are going to be multiplied by the length of the keys, so it matches up.
I want to mention something very specific here.
If you had dictionaries that had mismatching keys, this would not be what you would want to do right here.
I know that for this specific example it does work, so I'm using it.
If it did not, you would just have to do it a little bit differently.
Okay! So what's going on with this? We have our first index value, and we have the type, the subtype, the caption, copyright, approved, and metadata, right.
Then it does the same thing for all of these.
Let's skip down to number seven. This is what I wanted to match up, because when I create the default dictionary, here's my key as the type and then each value.
I want it to line up properly, and if I didn't do this else statement, we would have had an issue.
Let's finish this up and create our default dictionary: the first entry in the tuple is the key, and the second entry is our value.
Let's put this into a data frame and see what that looks like.
But I made a mistake.
There we go.
This looks like we're on the right track, and we did the same exact thing.
Let's scroll down.
So we did the same thing. That's perfect.
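Since the code itself is not in the transcript, here is a sketch of the approach as narrated: collect (index, (key, value)) tuples, pad empty entries with a filler so every column stays the same length, then group by key with a defaultdict. The names `media_column` and `FILL` and the sample data are assumptions; the filler "nope" follows the value mentioned later in the video.

```python
from collections import defaultdict

# Stand-in for the media column: entry 1 is empty and needs padding.
media_column = [
    [{"type": "image", "caption": "First"}],
    [],
    [{"type": "image", "caption": "Third"}],
]

# Assumes the first entry is non-empty and every dictionary shares its keys.
keys = list(media_column[0][0].keys())
FILL = "nope"

pairs = []
for idx, entry in enumerate(media_column):
    if len(entry) > 0:
        for d in entry:
            # Convert the dictionary to (key, value) tuples and keep the index.
            for item in d.items():
                pairs.append((idx, item))
    else:
        # Pad with the same keys so the columns stay aligned with the rows.
        for item in zip(keys, [FILL] * len(keys)):
            pairs.append((idx, item))

columns = defaultdict(list)
for idx, (key, value) in pairs:
    columns[key].append(value)

print(dict(columns))
```

Without the else branch, the empty entry would contribute nothing and every value after it would shift up one row. The resulting dictionary of equal-length lists can be passed straight to a pandas DataFrame.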
Let's just call this new df and clean this up a little bit, so it's not taking up my page.
We did pretty good here.
I want to note something once again.
If you have a list of dictionaries of varying lengths, you need to do one more step, all right.
Don't forget that.
You could consider doing something like this: basically, find the unique keys, which you could do with a set operation if you wanted.
Then you would have all of the keys, and you could do an if/else statement to say, if it's there put this, if not do that.
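That extra step might look something like the sketch below, which is one possible reading of the suggestion rather than code from the video. The records and the "nope" filler are hypothetical.

```python
# Hypothetical dictionaries whose keys do not fully match.
records = [
    {"type": "image", "caption": "First"},
    {"type": "image", "copyright": "Example credit"},
]

# Set union over every dictionary gives the full list of unique keys.
all_keys = sorted(set().union(*records))

# "If it's there put this, if not do that": get() supplies the fallback.
rows = [{key: record.get(key, "nope") for key in all_keys} for record in records]
print(all_keys)
print(rows)
```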
So here is basically what was going on with creating our tuple, so it cooperated with the formatting type.
The innermost list of lists, for the media metadata, is very interesting.
This threw me off and took me some time.
That's why this video is delayed.
Basically, what I'm doing is creating this function, where I have my default dictionary once again.
I'm iterating through my data frame column of choice, checking if it's a string.
Basically, if we don't have a list of dictionaries, then it's going to be a string value, saying nope or whatever I declared, or you have some empty value or whatever.
We have to iterate through our dictionary, turn it into tuples again, use our default dictionary, and return.
I have to iterate inside of the column, do the same type of comparison, and then throw in our function that has our default dictionary.
I used an else statement because I wanted to keep the formatting.
Entries number 7, 13, and 16, or something like that, do not have a list; they are strings.
We take care of that by creating these zipped tuples for our keys and our values, and I chuck that together.
I create a data frame and expand it.
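A sketch of that function under the same caveat as before: the real code is not in the transcript, so the helper name `collect_metadata`, the column contents, and the key list are all assumptions for illustration.

```python
from collections import defaultdict


def collect_metadata(items):
    """Group a list of metadata dictionaries into {key: [values, ...]}."""
    grouped = defaultdict(list)
    for item in items:
        for key, value in item.items():
            grouped[key].append(value)
    return dict(grouped)


# Stand-in for the metadata column: a list of dictionaries, or a filler string.
metadata_column = [
    [
        {"url": "a1", "format": "thumbnail", "height": 75, "width": 75},
        {"url": "a2", "format": "medium", "height": 140, "width": 210},
    ],
    "nope",
]
keys = ["url", "format", "height", "width"]

rows = []
for cell in metadata_column:
    if isinstance(cell, str):
        # No list here: zip the keys with the filler to keep the same columns.
        rows.append(dict(zip(keys, [cell] * len(keys))))
    else:
        rows.append(collect_metadata(cell))

print(rows)
```

Each row now holds one list per key (all URLs, all formats, and so on), so passing `rows` to a data frame gives one row per entry.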
So for each entry in this, remember there were three URLs, three formats, heights, and widths.
The last three columns are always the same.
You have this height and this particular width for this thumbnail.
You have this format with this height and this width, etc.
It was difficult to distinguish what's going on with these, but you can see that each one of them is different, all right.
I know the formatting is there.
I have this set up as a list of strings.
So if you further decide to expand this out and flatten it, you have that option easily, all right.
So just scroll back up; I want to mention this.
I decided to print these out so I can actually do a comparison. You see 10/3, 10/3, 10/29, 10/29, and if you look at them, they're all different because they have this different separator for these images or whatever they are, right.
It looks like I did the formatting for everything properly.
Then we concatenate and put everything together: the original data, the new data frame with the media, and the expanded URL information.
We end up with the same amount of rows, so that checks out.
Instead, you have 32 columns because it's expanded out now.
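The final step can be sketched with pandas concat along the column axis. The three small frames below are hypothetical stand-ins for the original data, the parsed media, and the parsed metadata.

```python
import pandas as pd

# Hypothetical stand-ins for the three pieces being combined.
original = pd.DataFrame({"title": ["A", "B"], "media": [[{"type": "image"}], []]})
media_df = pd.DataFrame({"type": ["image", "nope"]})
meta_df = pd.DataFrame({"url": [["a1", "a2"], "nope"]})

# axis=1 places the frames side by side, matching rows by index.
combined = pd.concat([original, media_df, meta_df], axis=1)
print(combined.shape)
```

Because concat aligns on the index, this only lines up correctly when all three frames share the same row order and index, which is why the padding for empty entries mattered earlier.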
So let's scroll back to the top and look: here's everything that worked out from our nested stuff that I wanted to retain as a list.
I kept this just as a reference.
Then these short ones right here came from the original media, all right.
And then the rest of them are just the original data.
That wasn't too bad of a video.
I'd just like to say: if you wanted to speed this up, consider using list comprehensions instead of loops.
Understand that there are times when index values actually do matter and you would like to preserve the order.
You need to adjust your code and remind yourself of that.
Finally, people can write elegant code all the time, but that just comes with experience.
Do realize that you may see elegant code, but it may not always be the most efficient, and it may not always solve the problem.
So don't get hung up on thinking your code has to look the best.
Just leave docstrings and notes so people understand what's going on.
Thank you for watching this video.
I hope I brought utility to someone.
Please like, share, and subscribe, and if you subscribe, turn on that notification bell.
I'll see you in the next video.
- Initialize an empty list to hold the final result.
- Extract the keys from the first dictionary in the input list and append them as a list to the final result list.
- Extract the values from each dictionary in the input list and append them as a list to the final result list using the zip() function.
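The steps above can be sketched as follows. This version uses a plain loop rather than zip(), and it assumes every dictionary has the same keys in the same order; the sample records are made up.

```python
# Hypothetical input: dictionaries that all share the same keys.
records = [
    {"name": "a", "value": 1},
    {"name": "b", "value": 2},
]

result = []
# The header row comes from the first dictionary's keys.
result.append(list(records[0].keys()))
# Each following row holds one dictionary's values.
for record in records:
    result.append(list(record.values()))

print(result)
```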
- (1) Using a list() function: my_list = list(my_dict.values())
- (2) Using a List Comprehension: my_list = [i for i in my_dict.values()]
- (3) Using For Loop: my_list = [] then for i in my_dict.values(): my_list.append(i)
You can easily convert a Python dictionary to a string using the str() function. The str() function takes an object (since everything in Python is an object) as an input parameter and returns a string variant of that object.
How do I convert a list of lists to a single list?
- List comprehension [x for l in lst for x in l] assuming you have a list of lists lst .
- Unpacking [*lst[0], *lst[1]] assuming you have a list of two lists lst .
- Using the extend() method of Python lists to extend all lists in the list of lists.
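A short sketch comparing the three options on a made-up list of two lists. All three produce the same flat list.

```python
lst = [[1, 2], [3, 4]]

# List comprehension: the outer loop picks a sublist, the inner loop its items.
flat_comprehension = [x for l in lst for x in l]

# Unpacking: only practical when the number of sublists is known (two here).
flat_unpacked = [*lst[0], *lst[1]]

# extend(): grow one result list in place, sublist by sublist.
flat_extended = []
for sub in lst:
    flat_extended.extend(sub)

print(flat_comprehension, flat_unpacked, flat_extended)
```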
Python provides an option of creating a list within a list. Simply put, it is a nested list: a list with one or more lists inside it as elements. Here, [a,b], [c,d], and [e,f] are separate lists which are passed as elements to make a new list.
How do I create a list from multiple dictionaries in Python?
We can append a new dictionary to the list of dictionaries by using the Python append() method. Example: In this example, we have a list of a single dictionary element. We will add another dictionary to this list using the append() method.
How to convert list using dictionary in Python?
Converting a list to a dictionary can be done by pairing a list of keys with a list of values. Create two lists, one for keys and one for values. Zip the two lists together using the zip() function to create a list of key-value pairs. Convert the list of key-value pairs to a dictionary using the dict() function.
How do you process a list of dictionaries in Python?
- #initialising a dictionary inside a list.
- #displaying the list.
- print("The dictionary inside list is: ",list_val)
- #checking the type.
- print("The type of list_val is: ",type(list_val))
Step 1: Import the Pandas Module. Step 2: Convert the list of dictionaries to a Pandas dataframe. A Pandas dataframe is like a 2D array or a table with rows and columns. Step 3: Now, convert the pandas dataframe to a CSV file using the to_csv function.
How to create a list of dictionaries in Python using for loop?
- To iterate all over dict items you'll need fruits.items()
- To build a list repeating each key N times do: [key]*N.
- As dict values indicate how many times to repeat, do: [key]*value.
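Putting those three points together, assuming a hypothetical `fruits` dictionary whose values are repeat counts:

```python
# Hypothetical dictionary: each value says how many times to repeat the key.
fruits = {"apple": 2, "kiwi": 1}

# [key] * value builds the repeats; the outer loop walks fruits.items().
repeated = [name for key, value in fruits.items() for name in [key] * value]
print(repeated)
```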
Use the from_dict(), from_records(), or json_normalize() methods to convert a list of dictionaries (dict) to a pandas DataFrame. Dict is a type in Python that holds key-value pairs. The key is used as the column name and the value as the column value when we convert a dict to a DataFrame.
How to convert list of strings to list of lists in Python?
Create an empty list x to store the result. Iterate over each element item and its index i in test_list using enumerate(). Use the eval() function to evaluate the string item as a Python expression and convert it into a list; for untrusted input, ast.literal_eval() is the safer choice because it only accepts literals.
How do you combine two lists into a list of lists in Python?
Method 1: Using nested for loop
Create a variable to store the input list of lists(nested list). Use the append() function(adds the element to the list at the end) to add this element to the result list. Printing the resultant list after joining the input list of lists.
The chain() function from itertools combines the multiple sublists into a single iterable object, and then the list() function converts the iterable object into a single list.
How do I make multiple lists into one list in Python?
Python's extend() method can be used to concatenate two lists. It iterates over the passed argument and adds each item to the list, extending it in place. All the elements of list2 get appended to list1, so list1 is updated and holds the result.
How do you convert a dictionary to a list in Python?
- # Convert to a list of tuples (keys and values) items_list = list(my_dict.items()) print(items_list) # Output: [('a', 1), ('b', 2), ('c', 3)]
- items_list = [[k,v] for k,v in my_dict.items()] print(items_list) # Output: [['a', 1], ['b', 2], ['c', 3]]
Use pd.DataFrame.from_dict() to transform a list of dictionaries into a pandas DataFrame. This function is used to construct a DataFrame from a dict of array-likes or dicts.
How do you turn a list of dictionaries into a list of tuples?
- Import the chain() function from the itertools module.
- Use the chain() function to create a single iterable from the dictionaries in the ini_list.
- Use a list comprehension to create a list of tuples by combining each key and value pair with the + operator.
- Flatten the list of tuples.
Using a for loop method
One way to convert a list to a dictionary in Python is to use a for loop that iterates through the list and builds the dictionary from its elements: create an empty dictionary my_dict, then loop through the list my_list, applying the enumerate() function.