


FIDIS

Future of Identity in the Information Society (No. 507512)

D2.3

[Final], Version: 2.0

File: fidis-wp2-del2.3.models.doc

Page 18

2.3.2 The extraction from data sources and from processes

In this case, the values associated with the attributes originate from two different sources: (1) databases; and (2) processes.

In the first case, the databases may be governmental databases (such as police or tax records), human resource databases (enterprise resource planning and knowledge management systems holding, for example, payroll or training information) or health file databases (managed by hospitals or by social security units).

In the second case, the data originate from a series of processes that capture them (and that then store them in databases). Examples of such processes include e-commerce systems (such as Amazon) and loyalty programs, which can capture the history of the transactions associated with each customer, and virtual community systems, which can capture the history of activities of their members (such as how long they have been in the community, and their number of postings).
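As an illustration of how a process can capture attribute values without any input from the person concerned, the following is a minimal sketch of a virtual community keeping a member's activity history. The class name `MemberRecord`, its fields and the sample dates are all hypothetical and are not taken from any real system.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class MemberRecord:
    """Hypothetical record kept by a virtual community system."""
    member_id: str
    joined_on: date
    posting_dates: list[date] = field(default_factory=list)

    def log_posting(self, when: date) -> None:
        # The process itself records the event; the member enters nothing.
        self.posting_dates.append(when)

    def attributes(self, today: date) -> dict[str, int]:
        # Attribute values are read directly off the captured history.
        return {
            "days_in_community": (today - self.joined_on).days,
            "number_of_postings": len(self.posting_dates),
        }


# Invented sample data.
record = MemberRecord("m-001", joined_on=date(2005, 1, 1))
record.log_posting(date(2005, 1, 15))
record.log_posting(date(2005, 2, 3))
print(record.attributes(today=date(2005, 3, 1)))
```

In this sketch the member never states either value; both follow from what the system has logged.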

The Type 1 IMS (organisational function), presented previously, represents a typical category of systems that employ this method, although it can also be used in the Type 3 IMS (individual function).

The personal data that are present in databases or captured via a set of processes are mostly outside the user’s control (the possibilities of correction by the end user are often limited). These data are also often heavily regulated by legislation specifying the type of data that can be recorded and the permitted uses of these data, including the combination of databases.

Even if this mode of collecting personal data appears to be more intrusive to people’s privacy, it is not without advantages, even for the people themselves. First, the data captured by this means can be considered much more reliable, since they directly reflect the activities of people, and not only the perception of these activities. Second, because this data collection is automatic, it can be considered less demanding for the end users.

Many of the attributes that can be recorded in this way are characteristics that have a certain level of permanence, while other categories of a person’s information can include all the transactions (commercial or not) in which the person has been engaged.

2.3.3 Data calculated and inferred from other attributes

In this case, unknown values associated with particular attributes are calculated from other attributes (typically the ones that have been obtained by the previous two methods). This category is relatively similar to the category previously described; however, it differs in the level of sophistication of the systems that make use of it. Notably, this method is more frequently used in Type 3 IMS (individual function) applications, which use it to provide some level of adaptability (for instance in e-learning systems or e-commerce systems).

The reliability of these calculated attributes is generally lower than that of non-calculated attributes. For instance, in Amazon the assertion “a customer that has bought a book about children is interested in children and is likely to buy other books about children” is only correct on average, since the customer may have bought this book only once, as a present for somebody else.
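The weakness of such an inference can be made concrete with a minimal sketch. It assumes a deliberately naive rule: a topic counts as an interest only if it was bought at least `min_purchases` times, and the strength of the interest is the share of purchases on that topic. The function name, the threshold and the sample history are hypothetical; this is not how Amazon or any other real site computes preferences.

```python
from collections import Counter


def infer_interests(purchased_topics: list[str], min_purchases: int = 2) -> dict[str, float]:
    """Map each sufficiently frequent topic to its share of all purchases."""
    counts = Counter(purchased_topics)
    total = len(purchased_topics)
    return {
        topic: n / total
        for topic, n in counts.items()
        if n >= min_purchases  # a single purchase may just have been a present
    }


# Invented purchase history: one book about children, three about cooking.
history = ["children", "cooking", "cooking", "cooking"]
print(infer_interests(history))
```

With the threshold lowered to 1, the single book about children would be reported as an interest, which is exactly the kind of error described above.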


The level of control over these calculated attributes is often limited by the simplicity of the algorithm used and by the way it was configured for the calculation. Thus, people who read the value of these attributes usually have, at best, only a vague idea of the underlying principles that have been used. For instance, a calculated attribute could be a level of risk that a bank calculates for a particular client, resulting from a combination of the values of attributes such as the person’s gross salary, assets such as real estate that the person may own, family status, the postal code of the person’s residence, or even ethnic origin. Another example is found on certain e-commerce websites, where the preferences of a customer are determined automatically.
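The bank example can be sketched as a weighted combination of attribute values. Everything in the sketch is an assumption made for illustration: the `Client` fields, the weights, the base value and the clamping to the range 0 to 1 are invented and do not describe any real scoring method.

```python
from dataclasses import dataclass


@dataclass
class Client:
    gross_salary: float       # yearly, in an arbitrary currency unit
    real_estate_value: float  # same unit
    dependants: int           # crude stand-in for family status


# Hypothetical configuration: these numbers are invented for illustration.
BASE_RISK = 0.8
WEIGHTS = {"salary": -0.00001, "real_estate": -0.000002, "dependants": 0.05}


def risk_level(client: Client) -> float:
    """Combine several attribute values into one calculated attribute."""
    raw = (
        BASE_RISK
        + WEIGHTS["salary"] * client.gross_salary
        + WEIGHTS["real_estate"] * client.real_estate_value
        + WEIGHTS["dependants"] * client.dependants
    )
    # Clamp so the result always lies between 0 (low risk) and 1 (high risk).
    return min(1.0, max(0.0, raw))


print(risk_level(Client(gross_salary=40_000, real_estate_value=100_000, dependants=2)))
```

Someone who sees only the resulting number cannot tell which attribute contributed what, nor how the weights were configured, which is the lack of transparency noted above.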

2.3.4 Data extracted via mining the information

The extraction of values via data mining techniques may appear similar to the calculation methods described above. It differs, however, in that the algorithms are applied globally to the data of (very large) groups of people, and not to the data set associated with a single person. The algorithms used are also of a more statistical and probabilistic nature, and often rely on heuristics. Finally, these algorithms may also be used to help create the user model itself, and in particular to help determine the set of attributes required to “summarise” the problem (for instance, in a banking application, an algorithm may determine that knowing the age and the postal code of a customer is sufficient to discriminate a reliable customer from an unreliable one, with a limited risk of error).
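How an algorithm might determine which attributes “summarise” the problem can be sketched with a simple majority-rule count over a whole population, in the spirit of one-rule (OneR-style) classifiers. The toy population, the attribute names and the function `rule_errors` are invented for illustration; real mining systems use far larger data sets and more robust statistics.

```python
from collections import Counter, defaultdict

# Invented toy population: (age_band, postal_code, favourite_colour, reliable)
POPULATION = [
    ("18-30", "1000", "red", False),
    ("18-30", "1000", "blue", False),
    ("18-30", "2000", "red", True),
    ("31-60", "1000", "blue", True),
    ("31-60", "2000", "red", True),
    ("31-60", "2000", "blue", True),
]
ATTRIBUTES = ["age_band", "postal_code", "favourite_colour"]


def rule_errors(rows, attribute_indices):
    """Count rows misclassified when each combination of attribute values
    is assigned the majority label observed for that combination."""
    groups = defaultdict(Counter)
    for row in rows:
        key = tuple(row[i] for i in attribute_indices)
        groups[key][row[-1]] += 1
    return sum(sum(c.values()) - max(c.values()) for c in groups.values())


# Errors for each single attribute, then for age band and postal code together.
for index, name in enumerate(ATTRIBUTES):
    print(name, rule_errors(POPULATION, [index]))
print("age_band + postal_code", rule_errors(POPULATION, [0, 1]))
```

On this toy data, age band and postal code together separate reliable from unreliable customers, while favourite colour does not. The errors are counted on the same rows used to build the rule, so a real system would check the result on separate data before trusting it.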

The Type 2 IMS (profiling function), presented previously, represents a typical category of systems that employ this method.

The types of attributes that are extracted via mining typically include people-related categories such as social categories or lifestyles. These attributes can be considered to be more abstract and less directly associated with the individuals.

At a more micro level, these attributes can represent user characteristics and behaviours that can be automatically extracted from the use of information systems. For instance, in the context of an e-commerce system, such attributes can reflect reliability characteristics (likelihood of fraud) and, in the context of a virtual community, the level of participation (such as the activity of the people in SourceForge.net).
