Regression & Classification

Regression
supervised learning: take examples of inputs and outputs. Now, given a new input, predict its output.

Mapping continuous inputs to outputs.
discrete, continuous

child Height, parent height
2/3 < 1, regression to the mean
Reinforcement learning

Regression in machine learning
Finding the best constant function
f(x) = c
E(c) = Σ_{i=1}^{n} (y_i - c)^2
LOSS, ERROR

Order of polynomial
k = 0: constant
k = 1: line
k = 2: parabola
f(x) = c0 + c1x + c2x^2 + ... + ckx^k

polynomial regression
c0 + c1x + c2x^2 + c3x^3 = y

Errors
Training data has errors
not modeling f, but f + ε
where do errors come from? sensor error

Cross Validation
Fundamental assumption
use a model that is complex enough to fit the data without causing problems on the test set
-training error
-cross validation error

-> scalar input, continuous
-> vector input, continuous
include more input features (size, distance from zoo)
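
For the constant-function case in this section (f(x) = c with squared error E(c)), setting the derivative of E(c) to zero gives c = mean of the outputs. The sketch below checks that numerically. The data values and the function names are hypothetical, chosen only for illustration.

```python
from statistics import mean

def squared_error(ys, c):
    """E(c): sum of squared differences between each output y_i and the constant c."""
    return sum((y - c) ** 2 for y in ys)

def best_constant(ys):
    """The constant that minimizes E(c) is the mean of the outputs."""
    return mean(ys)

# Hypothetical outputs, for illustration only.
ys = [1.0, 2.0, 4.0, 9.0]
c_star = best_constant(ys)

# Moving away from the mean in either direction increases the error.
error_at_mean = squared_error(ys, c_star)
error_above = squared_error(ys, c_star + 0.5)
error_below = squared_error(ys, c_star - 0.5)
```

Any other constant gives a larger E(c), which is why the k = 0 fit is just the average.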

predict credit score
job? age? assets?
-> distance, vector or scalar
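
A minimal sketch of choosing the polynomial order k by cross validation, as described in the regression notes above. It assumes least squares via the normal equations and a simple interleaved 3-fold split; the data (a parabola plus small fixed "sensor error") and all function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= factor * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_poly(xs, ys, k):
    """Least-squares coefficients c0..ck from the normal equations (A^T A) c = A^T y."""
    ata = [[sum(x ** (i + j) for x in xs) for j in range(k + 1)] for i in range(k + 1)]
    aty = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(k + 1)]
    return solve(ata, aty)

def predict(coeffs, x):
    """Evaluate c0 + c1 x + c2 x^2 + ... at x."""
    return sum(c * x ** i for i, c in enumerate(coeffs))

def cv_error(xs, ys, k, folds=3):
    """Mean squared error on held-out points, each point held out exactly once."""
    total = 0.0
    points = list(zip(xs, ys))
    for f in range(folds):
        train = [p for i, p in enumerate(points) if i % folds != f]
        test = [p for i, p in enumerate(points) if i % folds == f]
        coeffs = fit_poly([x for x, _ in train], [y for _, y in train], k)
        total += sum((predict(coeffs, x) - y) ** 2 for x, y in test)
    return total / len(points)

# Hypothetical data: f(x) = x^2 plus a small fixed error term (f + ε).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
noise = [0.3, -0.2, 0.1, -0.3, 0.2, -0.1, 0.3, -0.2, 0.1]
ys = [x * x + e for x, e in zip(xs, noise)]

errors = {k: cv_error(xs, ys, k) for k in (0, 1, 2)}
best_k = min(errors, key=errors.get)
```

The order with the lowest cross validation error is preferred over the order with the lowest training error, since training error keeps falling as k grows.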

ID3

LOOP:
-A <- best attribute
-Assign A as decision attribute for Node
-For each value of A, create a descendant of Node
-Sort training examples to leaves
-If examples perfectly classified, stop; else iterate over leaves

gain(S, A) = entropy(S) - Σ_v (|S_v| / |S|) entropy(S_v)
entropy(S) = -Σ_v P(v) log P(v)

ID3: Bias
INDUCTIVE BIAS
Restriction bias: H
Preference bias:
-good splits at top
-correct over incorrect
-shorter trees

Decision trees: other considerations
- continuous attributes? e.g. age, weight, distance

When do we stop?
-> everything classified correctly!
-> no more attributes!
-> no overfitting
Regression
splitting? variance?
output average, local linear fit
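
The "best attribute" step of ID3 can be sketched with the entropy and gain formulas from this section. This is an illustration only: the tiny data set, the attribute names, and the function names are hypothetical, and the logarithm is taken base 2.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """entropy(S) = -sum over label values v of P(v) * log2 P(v)."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gain(examples, labels, attribute):
    """gain(S, A) = entropy(S) - sum over values v of |S_v|/|S| * entropy(S_v)."""
    n = len(labels)
    subsets = {}
    for example, label in zip(examples, labels):
        subsets.setdefault(example[attribute], []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder

def best_attribute(examples, labels, attributes):
    """Pick the attribute with the highest information gain."""
    return max(attributes, key=lambda a: gain(examples, labels, a))

# Hypothetical examples: the label depends only on "hungry".
examples = [
    {"hungry": True, "raining": True},
    {"hungry": True, "raining": False},
    {"hungry": False, "raining": True},
    {"hungry": False, "raining": False},
]
labels = [True, True, False, False]

chosen = best_attribute(examples, labels, ["raining", "hungry"])
```

Splitting on "hungry" leaves two pure subsets (gain 1 bit), while splitting on "raining" leaves the labels as mixed as before (gain 0), so "hungry" goes at the top of the tree.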

Decision Trees

Supervised Learning
classification: true or false
regression

Credit history: lend money? -> classification: binary task

classification learning
– instances (input)
– concept: function -> T, F
– target concept -> actual answer
– hypothesis class -> all functions considered
– sample (training set)
– candidate: concept that might be the target concept
– testing set

Decision Tree
entry: type Italian, French, Thai
atmosphere: fancy, hole-in-the-wall (hiw), casual
occupied
hot date?
cost, hungry, raining

node -> values -> attribute

representation vs algorithm

Decision Trees: Learning
1. Pick best attribute
Best ~ splits the data
2. Ask question
3. Follow the answer path
4. Go to 1
until you get an answer

Decision trees: Expressiveness
Boolean
A and B, A or B, A xor B

n-or: any, n-xor: parity (odd)

XOR is hard, n attributes (boolean): O(n!), how many trees?, output is boolean

Truth table
a1, a2, a3, … an, output
y, t, t … t
t, t, t … t
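
A small sketch of the truth-table view above. It assumes the standard counting argument, which the notes do not spell out: n boolean attributes give 2^n rows, and each row's output can be set independently, so there are 2^(2^n) distinct boolean functions. Function names are hypothetical.

```python
from itertools import product

def parity(bits):
    """n-XOR: true when an odd number of attributes are true."""
    return sum(bits) % 2 == 1

def truth_table(n, f):
    """One row per assignment of the n boolean attributes, with f's output."""
    return [(bits, f(bits)) for bits in product([False, True], repeat=n)]

def count_boolean_functions(n):
    """Each of the 2^n rows can output T or F independently."""
    return 2 ** (2 ** n)

table = truth_table(3, parity)
rows = len(table)
functions = count_boolean_functions(3)
```

The double exponential is why the hypothesis space of decision trees is so large, and why parity, which depends on every attribute in every row, needs a large tree.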

Philosophy of Machine Learning

Theoretical, Practical
What is machine learning? Not just proving theorems
computational statistics
broader notion of building computational artifacts that learn over time based on experience.

-supervised learning
-unsupervised learning
-reinforcement learning

1:1, 2:4, 3:9, 4:16, 5:25, 6:36
output <- input^2
induction and deduction
supervised learning = approximation
unsupervised learning = description
pixels -> Function approximator -> labels
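
The input/output pairs above can be read as a tiny induction problem: keep the hypotheses that agree with every example, then use one to predict an unseen input. The candidate set below is hypothetical and deliberately small; real function approximators search far larger spaces.

```python
# Labeled examples from the notes: input -> output.
examples = {1: 1, 2: 4, 3: 9, 4: 16, 5: 25, 6: 36}

# Hypothetical candidate hypotheses, for illustration only.
hypotheses = {
    "double": lambda x: 2 * x,
    "square": lambda x: x ** 2,
    "cube": lambda x: x ** 3,
}

def consistent(h):
    """A hypothesis is consistent if it reproduces every labeled example."""
    return all(h(x) == y for x, y in examples.items())

chosen = [name for name, h in hypotheses.items() if consistent(h)]

# Generalize to an input that was not in the examples.
prediction = hypotheses[chosen[0]](10)
```

Nothing in the six examples proves the rule holds at 10; assuming it does is the inductive step.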
Reinforcement learning

Optimization
supervised learning: labels data well
reinforcement learning: behavior scores well
unsupervised learning: cluster scores well

Localization Tools

project preparation tools
project execution tools
quality assurance tools

translation management, terminology management, translation memory management

In-Context Exact (ICE) Match: 100% with context
Exact Match: 100%
Fuzzy Match: lower than 100%
No Match: all new words
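
A rough sketch of how a new segment might be sorted into the match categories above. It is not how any particular translation memory tool works: the similarity measure (Python's difflib ratio), the 0.75 fuzzy threshold, the memory layout, and the function name are all assumptions made for illustration.

```python
from difflib import SequenceMatcher

def classify_match(segment, context, memory, fuzzy_threshold=0.75):
    """Classify a segment against a memory of (source_segment, context) pairs.

    fuzzy_threshold is a hypothetical cut-off below which a partial
    match is treated as no match at all.
    """
    best = 0.0
    for stored_segment, stored_context in memory:
        if stored_segment == segment:
            if stored_context == context:
                return "In-Context Exact"  # same text, same context
            best = 1.0                     # same text, different context
        else:
            ratio = SequenceMatcher(None, segment, stored_segment).ratio()
            best = max(best, ratio)
    if best == 1.0:
        return "Exact"
    if best >= fuzzy_threshold:
        return "Fuzzy"
    return "No Match"

# Hypothetical translation memory entries.
memory = [("Save file", "menu"), ("Open the settings page", "toolbar")]

ice = classify_match("Save file", "menu", memory)
exact = classify_match("Save file", "dialog", memory)
fuzzy = classify_match("Open the settings pages", "toolbar", memory)
none = classify_match("Delete account", "menu", memory)
```

The category matters for cost and effort: higher matches need less new translation work.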

Localization process

1. Product Preparation
2. Project Preparation
3. Project Execution
4. Quality Assessment

Language tiering
Chinese, Spanish, English, Hindi, Arabic, Russian

Decide languages -> understand issues and solution -> start designing the app

internationalization:
Process of generalizing a product to handle multiple languages and cultural conventions. It happens during software development.
Design & Engineering, Testing
Density and Fonts, Layout, Spacing, Message Description, Dates, currencies, units, addresses, phone numbers, Plurals and Genders

Pseudolocalization:
Simulation of localized text by replacing source text with fake characters.

Issues it reveals: Improperly Mirrored Interface, Untranslated Text, Not enough space
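
A minimal pseudolocalization sketch, assuming one common approach: swap vowels for accented look-alikes, pad the string to mimic longer translations, and wrap it in brackets so truncation is easy to spot. The character mapping, the 30% expansion factor, and the function name are hypothetical choices, not a standard.

```python
# Hypothetical mapping from plain vowels to accented look-alikes.
ACCENTED = str.maketrans("aeiouAEIOU", "àéîöûÀÉÎÖÛ")

def pseudolocalize(text, expansion=0.3):
    """Return a fake 'translated' version of text.

    - Accented characters show which strings pass through the
      localization pipeline (untranslated text stays plain).
    - Padding simulates languages that need more space.
    - Brackets make clipped strings visible in the UI.
    """
    padding = "~" * round(len(text) * expansion)
    return "[" + text.translate(ACCENTED) + padding + "]"

sample = pseudolocalize("Save")
longer = pseudolocalize("Manage Accounts")
```

Any UI string that still appears in plain ASCII after this step was probably hard-coded, and any string missing its closing bracket does not have enough space.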

Project preparation: Project Evaluation: quotation, schedule
overview of project, estimated number of words, deliverables, deadlines, costs

Localization kit
-content to be translated, terminology, translation memories, style guide, reference material
Glossary: Contains the terms that are commonly used in the project and that need to be consistent throughout.
Translation Memory: database that contains all the previously translated segments from a product
Style guide: document outlining a set of standards and best practices in terms of how to handle a specific project

Project execution: Translation
Quality Evaluation Tools

Localization Project

e.g. Google product team
– develop
– introduce new features
– introduce new versions

-> Google localization team
-> localization production, language services, vendor management, localization operations
Localization Project Manager(LPM)
-> external localization company: language service provider(LSP)
Language managers(Spanish, Hindi, Traditional Chinese)
Lastly, the product team launches the service globally

Localization operations
– technology, business
Vendor Management finds LSPs, builds relationships

-Requesters, localization project managers, language managers, localization operations, vendor management, external language service providers

User Interface

User Interface:
The space where interactions between humans and machines occur.

e.g. ATM
Withdraw, Deposit, Manage Accounts, Account Balance

Desktop software: applications you download and install onto laptop or desktop computers
Web apps: Similar to traditional software, but you don’t need to install them to use them
Mobile apps: Similar to desktop applications, but they run on mobile phones and have different considerations

Where?
UI is found on desktop, web and on the phone.
Who?
UI is used by new users as well as users familiar with the product.
Why?
UI enables users to accomplish goals.
-> lack of product knowledge
Providing message descriptions, Providing Reference material and guidelines

Research what people search for in the local market.
Telefoni Cellulari, Cellulari

English, Turkish, Turkish back translation
hotel chain, otel zinciri, hotel chain
holidays, tatil, holiday vacation
hospitality, misafirperverlik, generosity
istanbul hotels, istanbul otelleri, istanbul hotels

localization

In house: engineers, translation team, project manager

Marketing Content: Engaging, Persuasive, Well Written
– First contact with a product
– Used by anyone interested in the product
– Designed to attract potential users

Online help
– FAQs
– Software Documentation
– Troubleshooting Manuals

Product knowledge, Consistent Terminology

Where?
Videos are found in online help pages, marketing pages, training platforms, etc.

Revoicing
voice-over, dubbing, narration, audio description, free commentary

Dubbing: Actors’ voices are recorded over the original audio track
Subtitling: Written translation of spoken words and on-screen text

Nullability Annotations

_Nullable - can have a nil value
_Nonnull - not expected to be nil

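#import <Foundation/Foundation.h>

@class Person; // forward declaration; Person is assumed to be declared elsewhere

// Book as written has no nullability annotations yet.
@interface Book : NSObject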

@property (nonatomic, copy) NSString *title;
@property (nonatomic) Person *author;
@property (nonatomic) Person *editor;
@property (nonatomic) int yearOfPublication;

-(instancetype)initWithTitle:(NSString*)title
	author:(Person*)author
	year:(int)year;

@end