Programming Basics and Data Analytics with Python|Deepak|25th Jul - 29th Aug

Discussion in 'Big Data and Analytics' started by Kunal Guwalani, Jul 26, 2020.

  1. Kunal Guwalani

    Kunal Guwalani Well-Known Member
    Simplilearn Support

    Joined:
    Jul 17, 2018
    Messages:
    201
    Likes Received:
    22
    #1
    Niharika N Gupta likes this.
  2. Maja Nikolova

    Maja Nikolova Member

    Joined:
    Jan 31, 2020
    Messages:
    4
    Likes Received:
    0
    Hi All,

    Can somebody please help me with the following:

    I am trying to create a boxplot for the "Price" field for the App Prediction Project, which is required for the univariate analysis.

    I used the same code as given in the course, but no plot is displayed:

    import matplotlib.pyplot as plt
    import seaborn as sns
    sns.boxplot( y="Price", data=pd.melt(df))
    plt.plot('Price')

    I tried this option:
    fig1, ax1 = plt.subplots()
    ax1.set_title('Price')
    ax1.boxplot('Price')

    Moreover, the code for the histogram works: sns.distplot(df['Rating'],bins=100), but the boxplot code does not.

    Can someone please help me?
    I have googled a lot, reviewed the course, and used the same code, but the plot is still not displayed at all in Spyder (Anaconda).

    Many thanks in advance!
    Maja
     
    #2
  3. Ganesan_11

    Ganesan_11 New Member
    Alumni

    Joined:
    Nov 11, 2019
    Messages:
    1
    Likes Received:
    1
    Hi,
    What error are you getting when you run the boxplot?
    Try the code below:
    sns.boxplot(Dataframename['Price'])
     
    #3
    Maja Nikolova likes this.
  4. Maja Nikolova

    Maja Nikolova Member

    Joined:
    Jan 31, 2020
    Messages:
    4
    Likes Received:
    0

    Thanks for the help!
    Now the boxplot is displayed, but without the box. Could you please also let me know whether it is displayed correctly? From what I learned in the course, a boxplot should not look the way it does in my case; please see the screenshot below.
     

    Attached Files:

    #4
  5. _78376

    _78376 New Member

    Joined:
    May 7, 2020
    Messages:
    1
    Likes Received:
    0
    When will Deepak sir be back?
     
    #5
  6. Chandanam Harish

    Chandanam Harish New Member

    Joined:
    Jun 22, 2019
    Messages:
    1
    Likes Received:
    0
    That is due to outliers in 'Price'. The difference between the minimum and maximum values in the Price column is relatively large, while most of the values are close to zero. The quartiles therefore sit almost on top of each other, and at the scale needed to show the outliers the box collapses into a line.
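A minimal sketch of the effect described above, using only the standard library. The price list is made up for illustration (mostly free apps plus a few expensive ones) and is not taken from the project data.

```python
import statistics

# Hypothetical prices: most apps are free, a few are expensive outliers.
prices = [0.0] * 90 + [0.99] * 5 + [4.99] * 3 + [79.99, 399.99]

# The quartile cut points (Q1, median, Q3) define the edges of the box.
q1, median, q3 = statistics.quantiles(prices, n=4)
iqr = q3 - q1

print("Q1:", q1, "median:", median, "Q3:", q3)
print("IQR (box height):", iqr)
print("Full range:", max(prices) - min(prices))
```

When the interquartile range is zero and the full range is in the hundreds, the box has no visible height and only the outlier points stand out.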
     
    #6
  7. Aayushi_6

    Aayushi_6 Well-Known Member

    Joined:
    Sep 19, 2016
    Messages:
    201
    Likes Received:
    26
    Hi Guys,

    I have uploaded the Day 3 class materials to the drive. Please check.
     
    #7
  8. Kritika Aggarwal_1

    Kritika Aggarwal_1 New Member

    Joined:
    Jul 10, 2020
    Messages:
    1
    Likes Received:
    0
    Thank you so much, Aayushi. The material is very good.
     
    #8
  9. _84050

    _84050 New Member

    Joined:
    Jun 12, 2020
    Messages:
    1
    Likes Received:
    0
    #9
  10. _79969

    _79969 Member

    Joined:
    May 25, 2020
    Messages:
    2
    Likes Received:
    0
    Hi,
    What is wrong with the program below?
    When I input the integers 23, 43 and 9, I get 9 as the output instead of 43.
    Isn't the max function supposed to return the largest number?

    #Python program to find the largest number among the three input numbers
    print ("Enter 3 integers")
    num = [input(),input(),input()]
    print ("Maximum of ", num[0],num[1],num[2]," is :",max(num))
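A likely explanation, sketched under the assumption that the program runs on Python 3: input() returns strings, so max() compares the values as text, character by character, and "9" sorts after "2" and "4". Converting to int before comparing gives the numeric maximum. The helper name largest is made up for this example.

```python
def largest(raw_values):
    """Convert the raw text inputs to int before comparing them."""
    return max(int(value) for value in raw_values)


# The same three values as in the post, in the form input() would return them.
raw = ["23", "43", "9"]

print(max(raw))      # compares strings, so the result is "9"
print(largest(raw))  # compares integers, so the result is 43
```

In the program above, the equivalent change would be to read each value with int(input()) instead of input().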
     
    #10
  11. Anurag Talati

    Anurag Talati Member

    Joined:
    Jul 17, 2020
    Messages:
    3
    Likes Received:
    0
    Hello Aayushi Ma'am,
    Where can we find the Python assignment you were going to put on the drive? I am unable to find the short assignment you were supposed to upload.
     
    #11
  12. Anurag Talati

    Anurag Talati Member

    Joined:
    Jul 17, 2020
    Messages:
    3
    Likes Received:
    0
    Hello Ma'am, where is the short assignment on Python problems that you were going to provide for the week's practice? I am unable to find it on the drive.
     
    #12
  13. _79969

    _79969 Member

    Joined:
    May 25, 2020
    Messages:
    2
    Likes Received:
    0
    Hi, it is in the Day 3 folder.
     
    #13
  14. _86604

    _86604 Member

    Joined:
    Jun 29, 2020
    Messages:
    3
    Likes Received:
    0
    Hi Aayushi,
    How do we submit the assignment? On the drive, or by uploading it in the community?
     
    #14
  15. Aayushi_6

    Aayushi_6 Well-Known Member

    Joined:
    Sep 19, 2016
    Messages:
    201
    Likes Received:
    26
    Attaching here as well. (Python ASSIGNMENT)
     

    Attached Files:

    #15
  16. _85970

    _85970 New Member

    Joined:
    Jun 26, 2020
    Messages:
    1
    Likes Received:
    0
    Hi Aayushi,

    Please find the Assignment 1 answers attached.

    Thanks,
    Sathyanarayanan.C
     

    Attached Files:

    #16
  17. _86604

    _86604 Member

    Joined:
    Jun 29, 2020
    Messages:
    3
    Likes Received:
    0
    Where do we submit the assignment? In this same community thread, or by mail?
     
    #17
  18. _86604

    _86604 Member

    Joined:
    Jun 29, 2020
    Messages:
    3
    Likes Received:
    0
    Hi Aayushi,
    Please find attached Assignment_Answer

    Thanks,
    Cilambarasan
     

    Attached Files:

    #18
  19. _85192

    _85192 Member

    Joined:
    Jun 22, 2020
    Messages:
    2
    Likes Received:
    0
    Hi Aayushi,

    For the assignment on numbers divisible by 7 and not divisible by 5, I have used the following logic:

    list=[x for x in range(2000,3200) for questionaet,remainder in enumerate(divmod(x,7)) if remainder ==0]
    bridge=[x for x in list for quo,rem in enumerate(divmod(x,5)) if rem !=0]
    print(bridge,end=" ")

    I am not sure what I have done wrong. The output of bridge should have been all numbers divisible by 7 but not divisible by 5. Instead, numbers divisible by 7 are printed twice, and some numbers that are divisible by 5 are printed as well. For example, 2002 is printed twice, and 2065 should not have been printed but is.

    The output I get is below:
    [2002, 2002, 2009, 2009, 2016, 2016, 2023, 2023, 2030, 2037, 2037, 2044, 2044, 2051, 2051, 2058, 2058, 2065, 2072, 2072, 2079, 2079, 2086, 2086, 2093, 2093, 2100, 2107, 2107, 2114, 2114, 2121, 2121, 2128, 2128, 2135, 2142, 2142, 2149, 2149, 2156, 2156, 2163, 2163, 2170, 2177, 2177, 2184, 2184, 2191, 2191, 2198, 2198, 2205, 2212, 2212, 2219, 2219, 2226, 2226, 2233, 2233, 2240, 2247, 2247, 2254, 2254, 2261, 2261, 2268, 2268, 2275, 2282, 2282, 2289, 2289, 2296, 2296, 2303, 2303, 2310, 2317, 2317, 2324, 2324, 2331, 2331, 2338, 2338, 2345, 2352, 2352, 2359, 2359, 2366, 2366, 2373, 2373, 2380, 2387, 2387, 2394, 2394, 2401, 2401, 2408, 2408, 2415, 2422, 2422, 2429, 2429, 2436, 2436, 2443, 2443, 2450, 2457, 2457, 2464, 2464, 2471, 2471, 2478, 2478, 2485, 2492, 2492, 2499, 2499, 2506, 2506, 2513, 2513, 2520, 2527, 2527, 2534, 2534, 2541, 2541, 2548, 2548, 2555, 2562, 2562, 2569, 2569, 2576, 2576, 2583, 2583, 2590, 2597, 2597, 2604, 2604, 2611, 2611, 2618, 2618, 2625, 2632, 2632, 2639, 2639, 2646, 2646, 2653, 2653, 2660, 2667, 2667, 2674, 2674, 2681, 2681, 2688, 2688, 2695, 2702, 2702, 2709, 2709, 2716, 2716, 2723, 2723, 2730, 2737, 2737, 2744, 2744, 2751, 2751, 2758, 2758, 2765, 2772, 2772, 2779, 2779, 2786, 2786, 2793, 2793, 2800, 2807, 2807, 2814, 2814, 2821, 2821, 2828, 2828, 2835, 2842, 2842, 2849, 2849, 2856, 2856, 2863, 2863, 2870, 2877, 2877, 2884, 2884, 2891, 2891, 2898, 2898, 2905, 2912, 2912, 2919, 2919, 2926, 2926, 2933, 2933, 2940, 2947, 2947, 2954, 2954, 2961, 2961, 2968, 2968, 2975, 2982, 2982, 2989, 2989, 2996, 2996, 3003, 3003, 3010, 3017, 3017, 3024, 3024, 3031, 3031, 3038, 3038, 3045, 3052, 3052, 3059, 3059, 3066, 3066, 3073, 3073, 3080, 3087, 3087, 3094, 3094, 3101, 3101, 3108, 3108, 3115, 3122, 3122, 3129, 3129, 3136, 3136, 3143, 3143, 3150, 3157, 3157, 3164, 3164, 3171, 3171, 3178, 3178, 3185, 3192, 3192, 3199, 3199]

    If I use a set in the code above, the duplicates are removed, but some numbers divisible by 5 are still printed.

    e.g.
    list=[x for x in range(2000,3200) for questionaet,remainder in enumerate(divmod(x,7)) if remainder ==0]
    bridge=set([x for x in list for quo,rem in enumerate(divmod(x,5)) if rem !=0])
    print(bridge,end=" ")
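One possible reading of the behaviour, offered as a sketch and not as the official assignment solution: enumerate() pairs an index with each element of the (quotient, remainder) tuple that divmod() returns, so the loop variable named rem sees the quotient first and the remainder second. The quotient is never zero in this range, so every number passes the rem != 0 test at least once, and numbers not divisible by 5 pass it twice. Testing the remainder directly with % avoids this. The variable name result is made up for this example.

```python
# Keep x only when it is a multiple of 7 and not a multiple of 5.
# Same range as in the post; note that range() excludes 3200 itself.
result = [x for x in range(2000, 3200) if x % 7 == 0 and x % 5 != 0]

# What the original inner loop iterates over for x = 2002:
# index 0 is paired with the quotient, index 1 with the remainder.
print(list(enumerate(divmod(2002, 5))))  # [(0, 400), (1, 2)]

print(result[:5])
print(len(result), "numbers, duplicates:", len(result) - len(set(result)))
```

With this filter each qualifying number appears once, and values such as 2030 and 2065 are excluded.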
     
    #19
  20. sangeetha.s(3501176)

    Alumni

    Joined:
    Aug 20, 2014
    Messages:
    1
    Likes Received:
    0
    Hi, I'm not able to access Jupyter Lab; I get a '503 Service Unavailable' error in the Firefox browser. Has anyone faced a similar issue? What is the solution?
     
    #20
  21. Maja Nikolova

    Maja Nikolova Member

    Joined:
    Jan 31, 2020
    Messages:
    4
    Likes Received:
    0
    Hi All,

    Can you please help me with the following (the question is related to the App Rating Project):

    1. When creating a boxplot for the variables Rating and Content Rating, the attached visual is displayed.
    How can I conclude from the boxplot whether there is any difference in the ratings?
    Moreover, how can I conclude whether some types are liked better?

    2. When creating a boxplot for Category and Rating, the attached boxplot is displayed.
    How can I answer the question "Which genre has the best rating?" when I cannot read the labels on the x-axis because they overlap? I even rotated them vertically, but that did not help.

    Can you please help me as soon as possible?

    Many thanks in advance!
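One way to answer "which group is rated best" without reading the plot is to compare the per-group medians, which are the centre lines of the boxes. The sketch below uses only the standard library and made-up (category, rating) pairs; the category names and values are hypothetical and not taken from the project data. With pandas, the same summary could be obtained with groupby('Category')['Rating'].median(), and swapping the x and y arguments of the boxplot call draws the boxes horizontally so that long labels do not overlap.

```python
import statistics
from collections import defaultdict

# Hypothetical (category, rating) pairs standing in for the app data.
rows = [
    ("GAME", 4.5), ("GAME", 4.1), ("GAME", 4.3),
    ("TOOLS", 3.9), ("TOOLS", 4.0), ("TOOLS", 3.5),
    ("EVENTS", 4.6), ("EVENTS", 4.4), ("EVENTS", 4.8),
]

# Collect the ratings that belong to each category.
ratings_by_category = defaultdict(list)
for category, rating in rows:
    ratings_by_category[category].append(rating)

# The median of each group is the line drawn inside its box.
medians = {c: statistics.median(v) for c, v in ratings_by_category.items()}

# Rank the categories from the highest median rating to the lowest.
ranked = sorted(medians.items(), key=lambda item: item[1], reverse=True)
for category, median in ranked:
    print(f"{category:<8} median rating = {median}")
```

Groups whose boxes overlap heavily and whose medians are close are not clearly different; a group whose whole box sits higher than the others is the one that is liked better.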
     

    Attached Files:

    #21
  22. Maja Nikolova

    Maja Nikolova Member

    Joined:
    Jan 31, 2020
    Messages:
    4
    Likes Received:
    0
    Hi All,

    Since there is no example of log transformation anywhere in the course, I am having some difficulty with the following:

    1. Reviews and Installs have some values that are still relatively very high. Before building a linear regression model, you need to reduce the skew. Apply log transformation (np.log1p) to Reviews and Installs.
    I wrote the following code (2 options):

    1.
    import numpy as np
    in_array = inp1['Reviews', 'Installs']
    print ("Input array : ", in_array)
    out_array = np.log1p(in_array)
    print ("Output array : ", out_array)

    I receive the following error: KeyError: ('Reviews', 'Installs')
    upload_2020-8-9_19-38-1.png

    2.
    inp1['Reviews_norm'] = np.log1p(inp1['Reviews'])
    print ("Output array : ", 'Reviews_norm')

    I receive the following result:
    upload_2020-8-9_19-39-32.png

    Only two values appear to be log-transformed. Why are not all of them transformed?

    upload_2020-8-9_19-13-56.png


    2.

    inp1.drop(['App','Last Updated','Current Ver','Android Ver'],axis='columns', inplace = True)
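Two remarks on the log transformation attempts above, followed by a small sketch. In pandas, inp1['Reviews', 'Installs'] looks for a single column whose key is the tuple ('Reviews', 'Installs'), which explains the KeyError; selecting two columns takes a list, as in inp1[['Reviews', 'Installs']]. In the second option, print("Output array : ", 'Reviews_norm') prints the literal text Reviews_norm and not the new column. The sketch uses only the standard library and made-up numbers to show what log1p does to each value; np.log1p applies the same function to every element of a column.

```python
import math

# Hypothetical stand-in for two skewed columns of inp1.
columns = {
    "Reviews": [0, 159, 87510, 78158306],
    "Installs": [0, 10000, 5000000, 1000000000],
}

# log1p(x) computes log(1 + x), so zeros map to 0 instead of failing.
transformed = {
    name: [math.log1p(value) for value in values]
    for name, values in columns.items()
}

# Every value is transformed; the large values are compressed the most.
for name, values in transformed.items():
    print(name, [round(v, 2) for v in values])
```

Every element is transformed, so if a printed column appears to show only a few changed values, it is worth checking the full column and not only the lines the console displays.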
     

    Attached Files:

    #22
