5 Questions on Tayko Software Reseller - Examination | DSC 433 | Exams Humanities

Tayko Software Reseller

Tayko Software is a software catalog firm that sells games and educational software. It started out as a software manufacturer, and added third-party titles to its offerings. It has recently put together a revised collection of items in a new catalog, which it is preparing to roll out in a mailing.

In addition to its own software titles, Tayko's customer list is a key asset. In an attempt to grow its customer base, it has recently joined a consortium of catalog firms that specialize in computer and software products.

The consortium affords members the opportunity to mail catalogs to names drawn from a pooled list of customers. Members supply their own customer lists to the pool, and can "withdraw" an equivalent number of names each quarter. Members are allowed to do predictive modeling on the records in the pool so they can do a better job of selecting names from the pool.

Tayko has supplied its customer list of 200,000 names to the pool, which totals over 5,000,000 names, so it is now entitled to draw 200,000 names for a mailing. Tayko would like to select the names that have the best chance of performing well, so it conducts a test – it draws 20,000 names from the pool and does a test mailing of the new catalog to them.

This mailing yielded 1065 purchasers – a response rate of 0.05325. To optimize the performance of the data mining techniques, it was decided to work with a stratified sample that contained equal numbers of purchasers and non-purchasers. For ease of presentation, the data set for this case includes just 1000 purchasers and 1000 non-purchasers, an apparent response rate of 0.5. Therefore, after using the data set to predict who will be a purchaser, we must adjust the purchase rate back down by multiplying each case’s "probability of purchase" by 0.05325/0.5, or 0.1065.
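The rescaling described above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the function name `adjust_probability` and the sample probabilities are illustrative assumptions and do not come from the case data.

```python
# Figures stated in the case: 1,065 purchasers out of 20,000 test mailings,
# and a balanced data set of 1,000 purchasers and 1,000 non-purchasers.
TEST_MAILING_SIZE = 20_000
PURCHASERS = 1_065

true_rate = PURCHASERS / TEST_MAILING_SIZE    # response rate in the test mailing
sample_rate = 1_000 / (1_000 + 1_000)         # apparent rate in the balanced data
adjustment = true_rate / sample_rate          # the 0.05325/0.5 factor from the text


def adjust_probability(p_balanced: float) -> float:
    """Scale a probability estimated on the balanced sample back down
    toward the response rate of the original test mailing."""
    return p_balanced * adjustment


# Hypothetical model outputs, used only to illustrate the scaling.
for p in (0.2, 0.5, 0.9):
    print(f"{p:.2f} -> {adjust_probability(p):.5f}")
```

A case that the model scores at the balanced-sample average of 0.5 is scaled to the test mailing's overall response rate, which is the behavior the adjustment is meant to produce.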

There are two response variables in this case: "purch" indicates whether or not a prospect responded to the test mailing and purchased something, while "spend" indicates, for those who made a purchase, how much they spent. The overall procedure in this case will be to develop two models. One will be used to classify records as "purchase" or "no purchase." The other will be used for those cases that are classified as "purchase," and will predict the amount they will spend.
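The two-model procedure can be pictured as a two-stage scoring step. The sketch below is one possible arrangement in Python, not the method prescribed by the case: `score_prospects`, `toy_classifier`, and `toy_spend_model` are hypothetical names, and the two toy rules are placeholders standing in for models that would be fitted to the data.

```python
from typing import Callable, Dict, List, Optional, Sequence

Record = Dict[str, float]  # one prospect's predictor values


def score_prospects(
    records: Sequence[Record],
    classify: Callable[[Record], bool],
    predict_spend: Callable[[Record], float],
) -> List[Optional[float]]:
    """Stage 1 classifies each record as purchase / no purchase.
    Stage 2 predicts spend only for records classified as purchase;
    all other records get None."""
    results: List[Optional[float]] = []
    for rec in records:
        if classify(rec):
            results.append(predict_spend(rec))
        else:
            results.append(None)
    return results


# Placeholder rules for illustration only; they are NOT derived from the data.
def toy_classifier(rec: Record) -> bool:
    return rec["freq"] >= 2


def toy_spend_model(rec: Record) -> float:
    return 50.0 + 10.0 * rec["freq"]


prospects = [{"freq": 0}, {"freq": 3}]
print(score_prospects(prospects, toy_classifier, toy_spend_model))
```

Passing the two models in as callables keeps the scoring logic independent of whichever classification and prediction techniques are eventually chosen.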

The following table provides a description of the variables available in this case. A partition variable is used because we will be developing two different models in this case and we want to preserve the same partition structure for assessing each model.

Var. #  Variable Name  Description                                                          Variable Type  Code Description
1       usa            Is it a US address?                                                  binary         1: yes 0: no
2-16    s_*            Source catalog for the record (15 possible sources)                  binary         1: yes 0: no
17      freq           Number of transactions in last year at source catalog                numeric
18      last           How many days ago was last update to cust. record                    numeric
19      first          How many days ago was 1st update to cust. record                     numeric
20      web            Customer placed at least 1 order via web                             binary         1: yes 0: no
21      male           Customer is male                                                     binary         1: yes 0: no
22      resid          Address is a residence                                               binary         1: yes 0: no
23      purch          Person made purchase in test mailing                                 binary         1: yes 0: no
24      spend          Amount spent by customer in test mailing ($)                         numeric
25      part           Variable indicating which partition the record will be assigned to  alpha          t: training v: validation s: test
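Because both models are assessed on the same partitions, the `part` codes in the table can be used to split the records once and reuse the split. The following is a minimal sketch in Python, assuming each record is held as a dictionary keyed by the variable names above; the helper `split_by_partition` and the sample rows are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# Partition codes as listed in the variable table.
PARTITION_NAMES = {"t": "training", "v": "validation", "s": "test"}


def split_by_partition(records: List[dict]) -> Dict[str, List[dict]]:
    """Group records by their 'part' code so that both models
    are built and assessed on identical partitions."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for rec in records:
        groups[PARTITION_NAMES[rec["part"]]].append(rec)
    return dict(groups)


# Made-up rows for illustration; these are not records from the data set.
rows = [
    {"purch": 1, "spend": 120.0, "part": "t"},
    {"purch": 0, "spend": 0.0, "part": "v"},
    {"purch": 1, "spend": 45.0, "part": "t"},
    {"purch": 0, "spend": 0.0, "part": "s"},
]

partitions = split_by_partition(rows)
for name, members in partitions.items():
    print(name, len(members))
```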
