API
src.cfa_subgroup_imputer.groups
Submodule for broad-sense handling of supergroups and subgroups.
Group
A class to represent a super or subgroup.
Source code in src/cfa_subgroup_imputer/groups.py
__init__(name, attributes=[], filter_on=None)
Group constructor.
Parameters:
-
name(Hashable) –Name defining the group.
-
attributes(Iterable[Attribute], default:[]) –Attributes currently attached to the group.
-
filter_on(Iterable[str] | None, default:None) –Keys used to identify this group in tabular JSON-like data.
Source code in src/cfa_subgroup_imputer/groups.py
add_attribute(attribute)
Return a new group with one additional attribute.
Parameters:
-
attribute(Attribute) –Attribute to append.
Returns:
-
Group–A new group containing all existing attributes plus attribute.
Source code in src/cfa_subgroup_imputer/groups.py
get_attribute(name)
Retrieve a named attribute.
Parameters:
-
name(Hashable) –Attribute name to retrieve.
Returns:
-
Attribute–The matching attribute.
Source code in src/cfa_subgroup_imputer/groups.py
get_attributes(names)
Retrieve multiple named attributes.
Parameters:
-
names(Iterable[Hashable]) –Attribute names to retrieve.
Returns:
-
list–Attributes in the same order as names.
Source code in src/cfa_subgroup_imputer/groups.py
rate_to_count(size_from='size')
Convert imputable rate-like attributes into count-like attributes.
Parameters:
-
size_from(Hashable, default:'size') –Name of the attribute containing group size.
Returns:
-
Group–A new group with converted measurement types where applicable.
Source code in src/cfa_subgroup_imputer/groups.py
restore_rates(size_from='size')
Convert imputable count-from-rate attributes back to rates.
Parameters:
-
size_from(Hashable, default:'size') –Name of the attribute containing group size.
Returns:
-
Group–A new group with restored rate-like attributes where applicable.
Source code in src/cfa_subgroup_imputer/groups.py
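The relationship between rate-like and count-like values can be sketched as below. This is a minimal illustration that assumes a count is the rate multiplied by the group size, which this page does not state explicitly; the helper names are hypothetical and are not part of the package.

```python
def rate_to_count(rate: float, size: float) -> float:
    """Turn a per-member rate into a count (assumes count = rate * size)."""
    return rate * size


def count_to_rate(count: float, size: float) -> float:
    """Turn a count back into a per-member rate (assumes rate = count / size)."""
    if size == 0:
        raise ValueError("Cannot restore a rate for a group of size 0.")
    return count / size


size = 2000.0   # illustrative group size
rate = 0.25     # illustrative rate, e.g. a vaccination rate

count = rate_to_count(rate, size)
restored = count_to_rate(count, size)
```

Converting to counts first makes the quantity additive across subgroups, and converting back recovers the original rate as long as the same size is used in both directions.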
GroupMap
A class that binds supergroups and subgroups together.
Source code in src/cfa_subgroup_imputer/groups.py
supergroup_names
property
Get all supergroup names.
__init__(sub_to_super, groups)
Default constructor, takes in a subgroup : supergroup dict, and, optionally, groups.
If no groups are provided, empty groups are created.
Source code in src/cfa_subgroup_imputer/groups.py
add_attribute(group_type, attribute_name, attribute_values, impute_action, attribute_class, measurement_type=None, attribute_json_values=None)
Bulk addition of attributes to all sub or supergroups.
Parameters:
-
group_type(GroupType) –Should the attribute be added to supergroups or subgroups?
-
attribute_name(Hashable) –The name of the attribute to be added.
-
attribute_values(dict[Hashable, object]) –For all groups of the specified type, the values of the attribute to be added.
-
impute_action(ImputeAction) –The impute_action for the attribute to be added.
-
attribute_class(type[Attribute] | type[ImputableAttribute]) –The class of the attribute to be added.
-
measurement_type(MeasurementType | None, default:None) –The measurement type of the attribute to be added, if it is an ImputableAttribute.
-
attribute_json_values(dict[Hashable, object] | None, default:None) –If the attribute_values are not something recorded directly in the json, this specifies how the values will be compared against json values and how they will be exported to json. None means to use the attribute_values.
Source code in src/cfa_subgroup_imputer/groups.py
data_from_dicts(data, group_type, exclude, count, copy, rate)
Populates measurements and attributes for groups found in the data.
Source code in src/cfa_subgroup_imputer/groups.py
from_supergroups(super_to_sub, groups)
classmethod
Alternative constructor, takes in a supergroup : [subgroups] dict.
Source code in src/cfa_subgroup_imputer/groups.py
make_many_to_one(super_to_sub)
staticmethod
Inverts a supergroup : [subgroups] one-to-many dict into a subgroup : supergroup many-to-one dict.
Source code in src/cfa_subgroup_imputer/groups.py
make_one_to_many(sub_to_super)
staticmethod
Inverts a subgroup : supergroup many-to-one dict into a supergroup : [subgroups] one-to-many dict.
Source code in src/cfa_subgroup_imputer/groups.py
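The two inversions can be sketched with plain dicts. This is an illustration of the idea only; the function names below are hypothetical stand-ins, not the package's static methods.

```python
from collections import defaultdict
from collections.abc import Hashable


def to_one_to_many(
    sub_to_super: dict[Hashable, Hashable],
) -> dict[Hashable, list[Hashable]]:
    """Invert subgroup : supergroup into supergroup : [subgroups]."""
    super_to_sub: dict[Hashable, list[Hashable]] = defaultdict(list)
    for sub, sup in sub_to_super.items():
        super_to_sub[sup].append(sub)
    return dict(super_to_sub)


def to_many_to_one(
    super_to_sub: dict[Hashable, list[Hashable]],
) -> dict[Hashable, Hashable]:
    """Invert supergroup : [subgroups] into subgroup : supergroup."""
    sub_to_super: dict[Hashable, Hashable] = {}
    for sup, subs in super_to_sub.items():
        for sub in subs:
            if sub in sub_to_super:
                # A subgroup may belong to only one supergroup.
                raise ValueError(f"Subgroup {sub!r} appears in two supergroups.")
            sub_to_super[sub] = sup
    return sub_to_super


sub_to_super = {"0-4 years": "child", "5-17 years": "child", "18+ years": "adult"}
super_to_sub = to_one_to_many(sub_to_super)
round_trip = to_many_to_one(super_to_sub)
```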
subgroup_names(name=None)
Get names of subgroups this supergroup contains
Source code in src/cfa_subgroup_imputer/groups.py
to_dicts(group_type)
Creates a list of dicts of the measurements in either the supergroups or subgroups.
Source code in src/cfa_subgroup_imputer/groups.py
src.cfa_subgroup_imputer.imputer
Module for imputation machinery.
Aggregator
A class which aggregates subgroups.
Source code in src/cfa_subgroup_imputer/imputer.py
__call__(map)
Impute and aggregate the given group map.
Source code in src/cfa_subgroup_imputer/imputer.py
Disaggregator
A class which imputes and disaggregates subgroups.
Source code in src/cfa_subgroup_imputer/imputer.py
__call__(map)
Impute and disaggregate the given group map.
Source code in src/cfa_subgroup_imputer/imputer.py
src.cfa_subgroup_imputer.json
Module for interfacing with JSON-style inputs.
aggregate(supergroup_data, subgroup_data, subgroup_to_supergroup, supergroups_from, subgroups_from, group_type, loop_over=[], rate=[], count=[], exclude=[], size_from='size', **kwargs)
Wrapper for impute with action="aggregate".
Source code in src/cfa_subgroup_imputer/json.py
create_group_map(supergroup_data, subgroup_data, subgroup_to_supergroup, supergroups_from, subgroups_from, group_type, **kwargs)
GroupMap construction utility for disaggregate. See there for more details.
Source code in src/cfa_subgroup_imputer/json.py
disaggregate(supergroup_data, subgroup_data, subgroup_to_supergroup, supergroups_from, subgroups_from, group_type, loop_over=[], rate=[], count=[], exclude=[], size_from='size', **kwargs)
Wrapper for impute with action="disaggregate".
Source code in src/cfa_subgroup_imputer/json.py
impute(action, supergroup_data, subgroup_data, subgroup_to_supergroup, supergroups_from, subgroups_from, group_type, loop_over=[], rate=[], count=[], exclude=[], size_from='size', **kwargs)
Takes in data for supergroups/subgroups, imputes and returns values for the subgroups/supergroups.
Parameters:
-
action(Literal['aggregate', 'disaggregate']) –Whether to aggregate or disaggregate.
-
supergroup_data(Iterable[dict[str, Any]]) –Information defining supergroups, including any data to disaggregate.
-
subgroup_data(Iterable[dict[str, Any]]) –Information defining the subgroups, including any data to aggregate.
-
subgroup_to_supergroup(Iterable[dict[str, Any]] | None) –Optional mapping defining all subgroup : supergroup.
-
supergroups_from(str) –Name of key in supergroup_data defining supergroups.
-
subgroups_from(str) –Name of key in subgroup_data defining subgroups.
-
group_type(GroupableTypes | None) –What kind of groups are these, categorical or age? Can only be None if providing subgroup_to_supergroup.
-
loop_over(Collection[str], default:[]) –A collection of covariates, within each combination of which we will separately disaggregate. For example, if we wanted to disaggregate age groups separately in every state and county in a dataset, this would be ["state", "county"].
-
rate(Collection[str], default:[]) –A list of the keys in supergroup_data which define rate measurements.
-
count(Collection[str], default:[]) –A list of the keys in supergroup_data which define count measurements.
-
exclude(Collection[str], default:[]) –A list of the keys in supergroup_data which define variables which are to be excluded from imputation and which will not be present in the output.
-
**kwargs–Passed to internals.
Returns:
-
list[dict[str, Any]]–Data with measurements imputed for the subgroups.
Source code in src/cfa_subgroup_imputer/json.py
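To make the loop_over parameter concrete, the sketch below shows JSON-style rows of the kind described above and splits them into one block per covariate combination. It is only an illustration of the grouping idea: the helper split_by_covariates is hypothetical, the data values are made up, and nothing here is the package's own implementation.

```python
from collections import defaultdict
from typing import Any


def split_by_covariates(
    rows: list[dict[str, Any]], loop_over: list[str]
) -> dict[tuple, list[dict[str, Any]]]:
    """Group rows by the values they take for the loop_over keys."""
    blocks: dict[tuple, list[dict[str, Any]]] = defaultdict(list)
    for row in rows:
        key = tuple(row[name] for name in loop_over)
        blocks[key].append(row)
    return dict(blocks)


# Illustrative supergroup rows: one count measurement ("cases") per age group and state.
supergroup_data = [
    {"state": "CA", "age_group": "0-17 years", "cases": 10.0, "size": 100.0},
    {"state": "CA", "age_group": "18+ years", "cases": 30.0, "size": 400.0},
    {"state": "WA", "age_group": "0-17 years", "cases": 5.0, "size": 50.0},
    {"state": "WA", "age_group": "18+ years", "cases": 8.0, "size": 200.0},
]

blocks = split_by_covariates(supergroup_data, loop_over=["state"])
```

Each block could then be imputed on its own, so age groups in one state never borrow information from another state.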
src.cfa_subgroup_imputer.mapping
Submodule for enumerating subgroup and supergroup maps.
AgeGroupHandler
A class for working with age groups.
Implements: cfa_subgroup_imputer.enumerator.Mapper
Source code in src/cfa_subgroup_imputer/mapping.py
STR_AGE_RANGE_CONVERTERS = ((re.compile('^(\\d+) years*'), lambda x: (float(x[0]), float(x[0]) + 1.0)), (re.compile('^(\\d+)\\+ years'), lambda x: (float(x[0]), inf)), (re.compile('^(\\d+)-(\\d+) years'), lambda x: (float(x[0]), float(x[1]) + 1.0)), (re.compile('^(\\d+)-<(\\d+) years*'), lambda x: (float(x[0]), float(x[1]))), (re.compile('^(\\d+) months*-(\\d+) years*'), lambda x: (float(x[0]) / 12.0, float(x[1]) + 1.0)), (re.compile('^(\\d+) months*-<(\\d+) years*'), lambda x: (float(x[0]) / 12.0, float(x[1]))), (re.compile('^(\\d+)-(\\d+) months*'), lambda x: (float(x[0]) / 12.0, float(x[1]) + 1.0 / 12.0)), (re.compile('^(\\d+)-<(\\d+) months*'), lambda x: (float(x[0]) / 12.0, float(x[1]) / 12.0)))
class-attribute
instance-attribute
The master list of age ranges we can convert.
Each element is a tuple of
- A regex which can extract the single age or the low/high ages and
- A function that returns a (low, high) tuple for ages in years
age_range_from_str(x)
Parse an age-group string into a lower and upper bound in years.
Source code in src/cfa_subgroup_imputer/mapping.py
age_ranges_equivalent(x, y)
True if the age ranges encode the same values, else False.
E.g., "1-3 years" and "1-<4 years" both describe the age group of 1-, 2-, and 3-year-olds.
Source code in src/cfa_subgroup_imputer/mapping.py
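A minimal sketch of regex-based parsing and the equivalence check follows. It reuses three of the year-based patterns listed in STR_AGE_RANGE_CONVERTERS, but the matching strategy (first pattern that matches, converter applied to the captured groups) and the function names are assumptions made for illustration.

```python
import re
from math import inf

# Three of the year-based (pattern, converter) pairs from the list above.
CONVERTERS = (
    (re.compile(r"^(\d+)\+ years"), lambda g: (float(g[0]), inf)),
    (re.compile(r"^(\d+)-(\d+) years"), lambda g: (float(g[0]), float(g[1]) + 1.0)),
    (re.compile(r"^(\d+)-<(\d+) years*"), lambda g: (float(g[0]), float(g[1]))),
)


def parse_age_range(text: str) -> tuple[float, float]:
    """Return (low, high) in years, with low inclusive and high exclusive."""
    for pattern, convert in CONVERTERS:
        match = pattern.match(text)
        if match:
            return convert(match.groups())
    raise ValueError(f"Unrecognized age group: {text!r}")


def ranges_equivalent(x: str, y: str) -> bool:
    """True if both strings parse to the same (low, high) bounds."""
    return parse_age_range(x) == parse_age_range(y)


inclusive = parse_age_range("1-3 years")
exclusive = parse_age_range("1-<4 years")
open_ended = parse_age_range("65+ years")
```

Representing every range as a half-open interval is what lets the inclusive "1-3 years" and the exclusive "1-<4 years" compare equal.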
construct_group_map(supergroups, subgroups, **kwargs)
Construct a group map by assigning each subgroup to a containing supergroup.
Parameters:
-
supergroups(Iterable[str]) –Supergroup labels.
-
subgroups(Iterable[str]) –Subgroup labels.
-
**kwargs–Additional options, including continuous_var_name and missing_option.
Returns:
-
GroupMap–Group mapping with age-range attributes and filters populated.
Source code in src/cfa_subgroup_imputer/mapping.py
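Assigning each subgroup to a containing supergroup can be sketched with half-open (low, high) ranges in years. This is an assumption-laden illustration: the function name is hypothetical, the labels are taken as already parsed into ranges, and the package's actual handling of options such as missing_option is not shown.

```python
from math import inf

AgeRange = tuple[float, float]  # (low, high), low inclusive, high exclusive


def assign_to_supergroups(
    supergroups: dict[str, AgeRange], subgroups: dict[str, AgeRange]
) -> dict[str, str]:
    """Map each subgroup label to the one supergroup whose range contains it."""
    sub_to_super: dict[str, str] = {}
    for sub_name, (sub_low, sub_high) in subgroups.items():
        containing = [
            name
            for name, (low, high) in supergroups.items()
            if low <= sub_low and sub_high <= high
        ]
        if len(containing) != 1:
            raise ValueError(
                f"Subgroup {sub_name!r} fits {len(containing)} supergroups, expected 1."
            )
        sub_to_super[sub_name] = containing[0]
    return sub_to_super


supergroups = {"0-17 years": (0.0, 18.0), "18+ years": (18.0, inf)}
subgroups = {
    "0-4 years": (0.0, 5.0),
    "5-17 years": (5.0, 18.0),
    "18-64 years": (18.0, 65.0),
    "65+ years": (65.0, inf),
}
mapping = assign_to_supergroups(supergroups, subgroups)
```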
Mapper
Bases: Protocol
A class that assists in making sub to supergroup maps for an underlying axis defined by a continuous variable, such as age.
E.g., something that takes you from "my age subgroups are... and my age supergroups are..." to a sub : super group name/string dict.
Source code in src/cfa_subgroup_imputer/mapping.py
OuterProductSubgroupHandler
A class for handling subgroups based on a categorical variable, where all categories (levels) of the subgrouping variable are found in all supergroups.
For example, if we have age-based supergroups [0-17 years, 18-64 years, 65+ years] and want [low, moderate, high]-risk subgroups, this class makes and handles creating all "0-17 years, low risk", ..., "65+ years, high risk" subgroups and mapping them to the supergroups.
Source code in src/cfa_subgroup_imputer/mapping.py
construct_group_map(supergroup_categories, subgroup_categories, supergroup_variable_name, subgroup_variable_names, **kwargs)
Constructs a GroupMap from all subgroups defined by the categories of subgroup and supergroup variables.
Parameters:
-
supergroup_categories(Sequence[Hashable]) –The categories of the variable defining the supergroups.
-
subgroup_categories(Sequence[Sequence[Hashable]]) –For each variable defining subgroups, the categories it can take.
-
supergroup_variable_name(str) –What is the variable that defines the supergroup?
-
subgroup_variable_names(Sequence) –What are the variables that define the subgroups?
Source code in src/cfa_subgroup_imputer/mapping.py
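The outer-product idea can be sketched with itertools.product. The sketch below only shows how every supergroup category is crossed with every subgroup category; keying subgroups by tuples is an assumption for illustration, and the package may name its subgroups differently.

```python
from itertools import product

supergroup_categories = ["0-17 years", "18-64 years", "65+ years"]
# One inner list per subgroup-defining variable; here a single "risk" variable.
subgroup_categories = [["low", "moderate", "high"]]

sub_to_super: dict[tuple[str, ...], str] = {}
for combo in product(supergroup_categories, *subgroup_categories):
    # The first element of each combination is the supergroup category.
    sub_to_super[combo] = combo[0]

first_subgroup = next(iter(sub_to_super))
```

With three supergroups and three risk levels this yields nine subgroups, three per supergroup.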
RaggedOuterProductSubgroupHandler
Bases: ABC
Source code in src/cfa_subgroup_imputer/mapping.py
construct_group_map(**kwargs)
Uses category combinations to construct a GroupMap. Each inner combination defines, in order, the category in each variable that defines a group. The last variable is taken to be the one which defines the supergroup.
E.g., [["low risk", "child"], ["high risk", "child"], ["low risk", "adult"]] defines two supergroups, "child" and "adult", and three subgroups, "low risk child", "high risk child", and "low risk adult".
If provided, variable_names is used when populating the group attributes.
Source code in src/cfa_subgroup_imputer/mapping.py
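The ragged case can be sketched directly from the example above. The function name is hypothetical, and joining the categories with a space to form subgroup names is an assumption chosen to reproduce the names in the example.

```python
def map_from_combinations(combinations: list[list[str]]) -> dict[str, str]:
    """Build subgroup : supergroup from explicit category combinations.

    The last entry of each combination is taken as the supergroup.
    """
    sub_to_super: dict[str, str] = {}
    for combo in combinations:
        supergroup = combo[-1]
        subgroup = " ".join(combo)
        sub_to_super[subgroup] = supergroup
    return sub_to_super


combinations = [
    ["low risk", "child"],
    ["high risk", "child"],
    ["low risk", "adult"],
]
ragged_map = map_from_combinations(combinations)
```

Unlike the outer-product case, only the listed combinations exist, so "adult" has no high-risk subgroup here.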
src.cfa_subgroup_imputer.utils
get_json_keys(x)
Get keys from list of dicts and make sure they are sync'd and all strs.
Source code in src/cfa_subgroup_imputer/utils.py
get_keys(x)
Get keys from list of dicts and make sure they're sync'd.
Source code in src/cfa_subgroup_imputer/utils.py
select(x, keys)
Get a list of dicts with only the specified keys.
unique(x)
Get only the rows out of an iterable of dicts that are unique.
Source code in src/cfa_subgroup_imputer/utils.py
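The behavior described for these helpers can be sketched for lists of dicts as follows. These are stand-in implementations written from the one-line descriptions above; the package's own versions may differ in details such as error types or the handling of unhashable values.

```python
from typing import Any


def get_keys(rows: list[dict[str, Any]]) -> list[str]:
    """Return the shared keys, raising if any row has a different key set."""
    if not rows:
        return []
    keys = list(rows[0].keys())
    for row in rows[1:]:
        if set(row.keys()) != set(keys):
            raise ValueError("Rows do not all have the same keys.")
    return keys


def select(rows: list[dict[str, Any]], keys: list[str]) -> list[dict[str, Any]]:
    """Keep only the requested keys in every row."""
    return [{key: row[key] for key in keys} for row in rows]


def unique(rows: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Drop duplicate rows, keeping first occurrences (values must be hashable)."""
    seen: set[frozenset] = set()
    result = []
    for row in rows:
        marker = frozenset(row.items())
        if marker not in seen:
            seen.add(marker)
            result.append(row)
    return result


rows = [
    {"state": "CA", "age_group": "0-17 years", "cases": 1},
    {"state": "CA", "age_group": "18+ years", "cases": 2},
    {"state": "WA", "age_group": "0-17 years", "cases": 3},
]
states = unique(select(rows, ["state"]))
```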
src.cfa_subgroup_imputer.variables
Submodule for handling variables, whether measurements or quantities used to define subgroups.
ImputeAction = Literal['impute', 'copy', 'ignore']
module-attribute
What should be done with this value when disaggregating?
- "impute" means the value will be imputed (must be )
- "copy" means the value from the supergroup will be copied to all subgroups
- "ignore" means this value is not propagated from supergroups to subgroups
MeasurementType = Literal['count', 'rate', 'count_from_rate', 'rate_from_count']
module-attribute
How a measurement behaves for disaggregation.
Mass-like measurements are things like counts, while density-like measurements are things like rates or proportions.
Attribute
A class for data we can associate with a subgroup.
Source code in src/cfa_subgroup_imputer/variables.py
__init__(value, name, impute_action, json_value=None)
Attribute constructor.
Parameters:
-
value(Any) –The value of the variable.
-
name(Hashable) –What is this variable? E.g., "size" or "vaccination rate"
-
impute_action(ImputeAction) –What should we do with this measurement when disaggregating? Note that just because we can impute it doesn't mean we will.
-
json_value(Any | None, default:None) –If the value is not something recorded directly in a dataframe, this specifies how to compare to values in json and how to output this value to a json. None means to use the value.
Source code in src/cfa_subgroup_imputer/variables.py
ImputableAttribute
Bases: Attribute
A class for data we can associate with a subgroup and which can be imputed to subgroups.
Source code in src/cfa_subgroup_imputer/variables.py
__init__(value, name, impute_action, measurement_type, json_value=None)
ImputableAttribute constructor.
Parameters:
-
value(float) –The value, e.g. a number of cases.
-
name(Hashable) –What is this variable? E.g., "size" or "vaccination rate"
-
impute_action(ImputeAction) –What should we do with this measurement when disaggregating? Note that just because we can impute it doesn't mean we will.
-
measurement_type(MeasurementType) –What kind of imputable attribute is this?
-
json_value(Any | None, default:None) –If the value is not something recorded directly in a dataframe, this specifies how to compare to values in json and how to output this value to a json. None means to use the value.
Source code in src/cfa_subgroup_imputer/variables.py
Range
A slice of a one-dimensional variable, e.g. [0, 3.14159).
Parameters:
-
lower(float) –Value at the lower end of the range.
-
upper(float) –Value at the upper end of the range.
Source code in src/cfa_subgroup_imputer/variables.py
assert_range_spanned_exactly(range, ranges)
Checks that the provided ranges, in aggregate, span exactly range.
[Range(0., 1.), Range(1., 10.)] span Range(0., 10.)
[Range(0., 1.), Range(1., 10.1)] does not span Range(0., 10.)
[Range(0., 1.), Range(2., 10.)] does not span Range(0., 10.)
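The spanning check can be sketched with plain (lower, upper) tuples in place of Range objects. This version returns a bool rather than raising, uses exact float comparison, and is a hypothetical illustration of the three cases above, not the package's function.

```python
def spans_exactly(
    target: tuple[float, float], ranges: list[tuple[float, float]]
) -> bool:
    """True if the ranges tile target with no gaps, overlaps, or overshoot."""
    ordered = sorted(ranges)
    if not ordered:
        return False
    # The pieces must start and end exactly where the target does.
    if ordered[0][0] != target[0] or ordered[-1][1] != target[1]:
        return False
    # Each piece must begin exactly where the previous one ended.
    return all(prev[1] == nxt[0] for prev, nxt in zip(ordered, ordered[1:]))


exact = spans_exactly((0.0, 10.0), [(0.0, 1.0), (1.0, 10.0)])
overshoot = spans_exactly((0.0, 10.0), [(0.0, 1.0), (1.0, 10.1)])
gap = spans_exactly((0.0, 10.0), [(0.0, 1.0), (2.0, 10.0)])
```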