Interview Guides

Batch Gradient Descent Linear Regression | Dataford Interview Questions - Dataford - Ace your Interview

Batch Gradient Descent Linear Regression

Hard

Coding

Regression

Problem

At Stripe, you need a tiny training routine for a linear pricing model. Implement batch gradient descent for linear regression with an intercept (bias).

Specification

Write train_linear_regression(X, y, lr, epochs) where:

X is a 2D list of floats with shape (n_samples, n_features)
y is a list of floats with length n_samples
lr is a positive float learning rate
epochs is a positive integer

Return a dictionary:

{"weights": [bias, w1, ..., wd], "preds": [p0, ..., p(n-1)]} where preds[i] is the prediction for X[i] using the final weights.

Raise ValueError if:

X or y is empty, or len(X) != len(y)
X is ragged (rows have different lengths) or has zero features

Examples

Example 1 Input: X = [[1.0],[2.0],[3.0]], y = [2.0,4.0,6.0], lr = 0.1, epochs = 2000 Output: weights ≈ [0.0, 2.0], preds ≈ [2.0, 4.0, 6.0] Explanation: Perfect line y = 2x.

Example 2 Input: X = [[0.0],[1.0]], y = [1.0,3.0], lr = 0.1, epochs = 2000 Output: weights ≈ [1.0, 2.0], preds ≈ [1.0, 3.0] Explanation: Line y = 1 + 2x.

Constraints

1 <= n_samples <= 2 * 10^4
1 <= n_features <= 50
1 <= epochs <= 10^4
0 < lr <= 1
All values in X and y are finite floats in [-1e6, 1e6]

Examples

Example 1

InputX = [[1.0], [2.0], [3.0]], y = [2.0, 4.0, 6.0], lr = 0.1, epochs = 2000Output{"weights": [0.0, 2.0], "preds": [2.0, 4.0, 6.0]} (approximately)WhyThe data lies exactly on y = 2x, so the learned bias is near 0 and slope near 2.

Example 2

InputX = [[0.0], [1.0]], y = [1.0, 3.0], lr = 0.1, epochs = 2000Output{"weights": [1.0, 2.0], "preds": [1.0, 3.0]} (approximately)WhyTwo points define y = 1 + 2x; batch GD converges to that line.

Constraints

1 <= n_samples <= 2 * 10^4
1 <= n_features <= 50
1 <= epochs <= 10^4
0 < lr <= 1
X[i][j] and y[i] are finite numbers in [-1e6, 1e6]

Function Signature

def train_linear_regression(X: list[list[float]], y: list[float], lr: float, epochs: int) -> dict:

Problem

At Stripe, you need a tiny training routine for a linear pricing model. Implement batch gradient descent for linear regression with an intercept (bias).

Specification

Write train_linear_regression(X, y, lr, epochs) where:

X is a 2D list of floats with shape (n_samples, n_features)
y is a list of floats with length n_samples
lr is a positive float learning rate
epochs is a positive integer

Return a dictionary:

{"weights": [bias, w1, ..., wd], "preds": [p0, ..., p(n-1)]} where preds[i] is the prediction for X[i] using the final weights.

Raise ValueError if:

X or y is empty, or len(X) != len(y)
X is ragged (rows have different lengths) or has zero features

Examples

Example 1 Input: X = [[1.0],[2.0],[3.0]], y = [2.0,4.0,6.0], lr = 0.1, epochs = 2000 Output: weights ≈ [0.0, 2.0], preds ≈ [2.0, 4.0, 6.0] Explanation: Perfect line y = 2x.

Example 2 Input: X = [[0.0],[1.0]], y = [1.0,3.0], lr = 0.1, epochs = 2000 Output: weights ≈ [1.0, 2.0], preds ≈ [1.0, 3.0] Explanation: Line y = 1 + 2x.

Constraints

1 <= n_samples <= 2 * 10^4
1 <= n_features <= 50
1 <= epochs <= 10^4
0 < lr <= 1
All values in X and y are finite floats in [-1e6, 1e6]

Examples

Example 1

Example 2

InputX = [[0.0], [1.0]], y = [1.0, 3.0], lr = 0.1, epochs = 2000Output{"weights": [1.0, 2.0], "preds": [1.0, 3.0]} (approximately)WhyTwo points define y = 1 + 2x; batch GD converges to that line.

Constraints

1 <= n_samples <= 2 * 10^4
1 <= n_features <= 50
1 <= epochs <= 10^4
0 < lr <= 1
X[i][j] and y[i] are finite numbers in [-1e6, 1e6]

Function Signature

def train_linear_regression(X: list[list[float]], y: list[float], lr: float, epochs: int) -> dict:

Practice Python

Python 3.10

Open on desktop for the full Python editor with syntax highlighting and autocomplete.

Up next

Diagnose Linear Regression Assumptions in PricingMedium URegularize House Price RegressionEasy Train House Prices with Gradient DescentEasy

Next question

Batch Gradient Descent Linear Regression

Hard

Coding

Regression

Problem

At Stripe, you need a tiny training routine for a linear pricing model. Implement batch gradient descent for linear regression with an intercept (bias).

Specification

Write train_linear_regression(X, y, lr, epochs) where:

X is a 2D list of floats with shape (n_samples, n_features)
y is a list of floats with length n_samples
lr is a positive float learning rate
epochs is a positive integer

Return a dictionary:

{"weights": [bias, w1, ..., wd], "preds": [p0, ..., p(n-1)]} where preds[i] is the prediction for X[i] using the final weights.

Raise ValueError if:

X or y is empty, or len(X) != len(y)
X is ragged (rows have different lengths) or has zero features

Examples

Example 1 Input: X = [[1.0],[2.0],[3.0]], y = [2.0,4.0,6.0], lr = 0.1, epochs = 2000 Output: weights ≈ [0.0, 2.0], preds ≈ [2.0, 4.0, 6.0] Explanation: Perfect line y = 2x.

Example 2 Input: X = [[0.0],[1.0]], y = [1.0,3.0], lr = 0.1, epochs = 2000 Output: weights ≈ [1.0, 2.0], preds ≈ [1.0, 3.0] Explanation: Line y = 1 + 2x.

Constraints

1 <= n_samples <= 2 * 10^4
1 <= n_features <= 50
1 <= epochs <= 10^4
0 < lr <= 1
All values in X and y are finite floats in [-1e6, 1e6]

Examples

Example 1

Example 2

InputX = [[0.0], [1.0]], y = [1.0, 3.0], lr = 0.1, epochs = 2000Output{"weights": [1.0, 2.0], "preds": [1.0, 3.0]} (approximately)WhyTwo points define y = 1 + 2x; batch GD converges to that line.

Constraints

1 <= n_samples <= 2 * 10^4
1 <= n_features <= 50
1 <= epochs <= 10^4
0 < lr <= 1
X[i][j] and y[i] are finite numbers in [-1e6, 1e6]

Function Signature

def train_linear_regression(X: list[list[float]], y: list[float], lr: float, epochs: int) -> dict:

Problem

At Stripe, you need a tiny training routine for a linear pricing model. Implement batch gradient descent for linear regression with an intercept (bias).

Specification

Write train_linear_regression(X, y, lr, epochs) where:

X is a 2D list of floats with shape (n_samples, n_features)
y is a list of floats with length n_samples
lr is a positive float learning rate
epochs is a positive integer

Return a dictionary:

{"weights": [bias, w1, ..., wd], "preds": [p0, ..., p(n-1)]} where preds[i] is the prediction for X[i] using the final weights.

Raise ValueError if:

X or y is empty, or len(X) != len(y)
X is ragged (rows have different lengths) or has zero features

Examples

Example 1 Input: X = [[1.0],[2.0],[3.0]], y = [2.0,4.0,6.0], lr = 0.1, epochs = 2000 Output: weights ≈ [0.0, 2.0], preds ≈ [2.0, 4.0, 6.0] Explanation: Perfect line y = 2x.

Example 2 Input: X = [[0.0],[1.0]], y = [1.0,3.0], lr = 0.1, epochs = 2000 Output: weights ≈ [1.0, 2.0], preds ≈ [1.0, 3.0] Explanation: Line y = 1 + 2x.

Constraints

1 <= n_samples <= 2 * 10^4
1 <= n_features <= 50
1 <= epochs <= 10^4
0 < lr <= 1
All values in X and y are finite floats in [-1e6, 1e6]

Examples

Example 1

Example 2

InputX = [[0.0], [1.0]], y = [1.0, 3.0], lr = 0.1, epochs = 2000Output{"weights": [1.0, 2.0], "preds": [1.0, 3.0]} (approximately)WhyTwo points define y = 1 + 2x; batch GD converges to that line.

Constraints

1 <= n_samples <= 2 * 10^4
1 <= n_features <= 50
1 <= epochs <= 10^4
0 < lr <= 1
X[i][j] and y[i] are finite numbers in [-1e6, 1e6]

Function Signature

def train_linear_regression(X: list[list[float]], y: list[float], lr: float, epochs: int) -> dict:

Practice Python

Python 3.10

Open on desktop for the full Python editor with syntax highlighting and autocomplete.

Up next

Diagnose Linear Regression Assumptions in PricingMedium URegularize House Price RegressionEasy Train House Prices with Gradient DescentEasy

Next question