Before you begin data mining

Discussion in 'Big Data and Analytics' started by Jasper Price, Feb 5, 2015.

  1. Jasper Price

    Jasper Price Member

    Joined:
    Dec 12, 2014
    Messages:
    13
    Likes Received:
    2
    Before you begin any data mining operation, you should have a clear picture of where you are going and what you want to achieve. One of the first questions to ask yourself before beginning big data analysis is 'What problem am I trying to solve?' Answering that question will illuminate your path.

    For your own business, what are your typical answers to that question? For instance, are you searching for shopper patterns on your website? Do you want to know why your traffic drops at certain times of day? What answers are you seeking from your data?
     
    #1
  2. Stewart Kelly

    Stewart Kelly Member

    Joined:
    Nov 25, 2014
    Messages:
    7
    Likes Received:
    2
    What computer language and software will you be using for your data mining process? SAS, R, and Python are all commonly used for data mining. Just use caution: SAS programmers won't necessarily have data mining skills, even though the language is commonly used for it. Knowing which language or software you will use before you begin is essential to your success.
     
    #2
  3. Kristopher McGee

    Joined:
    Dec 14, 2014
    Messages:
    13
    Likes Received:
    3
    One issue with data collection, such as surveys, is how often respondents don't give accurate information. For instance, a higher-income individual responding to a survey might not disclose full salary details. So if you were relying on a survey of a group, your information would be contaminated. You would need to obtain more accurate data before trying to extrapolate any meaning from it. This is one way your data might need to be cleaned up.
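
    As a rough illustration of that cleanup step, here is a minimal Python sketch that separates usable survey rows from ones that need follow-up. The rows, the field names (`respondent`, `income`) and the `min_plausible` cutoff are all made up for the example. Note that it only catches blank or obviously wrong values; it cannot detect someone who under-reports a believable number.

```python
# Hypothetical survey rows; names and values are invented for illustration.
responses = [
    {"respondent": "A", "income": 42000},
    {"respondent": "B", "income": None},   # income left blank
    {"respondent": "C", "income": 0},      # implausible value
    {"respondent": "D", "income": 125000},
]


def split_clean_and_suspect(rows, min_plausible=1):
    """Split rows into usable ones and ones with missing or implausible income.

    min_plausible is an assumed cutoff; pick one that fits your own survey.
    """
    clean, suspect = [], []
    for row in rows:
        income = row.get("income")
        if income is None or income < min_plausible:
            suspect.append(row)
        else:
            clean.append(row)
    return clean, suspect


clean, suspect = split_clean_and_suspect(responses)
print(len(clean), "usable;", len(suspect), "need follow-up")
```

    Keeping the suspect rows in their own list, rather than deleting them, leaves you the option of going back to those respondents for better data.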
     
    #3
  4. Evan Berry

    Evan Berry Member

    Joined:
    Nov 25, 2014
    Messages:
    7
    Likes Received:
    0
    Once you have your data, you will want to formulate a hypothesis. For instance, your hypothesis might be 'all home owners in a specific city have an income of at least $50,000'. You want to confirm or refute this hypothesis, since so far it is untested. You would then query your data to see whether or not it holds.
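
    To make the home-owner example concrete, here is a small Python sketch of one way to run that check. Because the hypothesis says 'all', a single home owner below the threshold is enough to refute it, so the query looks for counterexamples. The records, the city name and the field names are invented for illustration.

```python
# Hypothetical records; the city names and figures are made up.
records = [
    {"city": "Springfield", "home_owner": True, "income": 72000},
    {"city": "Springfield", "home_owner": True, "income": 48000},
    {"city": "Springfield", "home_owner": False, "income": 30000},
    {"city": "Shelbyville", "home_owner": True, "income": 41000},
]


def counterexamples(rows, city, threshold=50_000):
    """Return home owners in the given city whose income is below the threshold."""
    return [
        r for r in rows
        if r["city"] == city and r["home_owner"] and r["income"] < threshold
    ]


found = counterexamples(records, "Springfield")
hypothesis_holds = not found  # an "all ..." claim fails on one counterexample
print("Hypothesis holds:", hypothesis_holds)
print("Counterexamples:", found)
```

    Finding no counterexamples only shows the hypothesis holds for the data you have, so the result is only as good as the data behind it.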
     
    #4
