test for update-code-from-shorthand
parent 47ef6fd17d
commit 464c5e14ea
@@ -947,3 +947,4 @@ public Set<RelatedEntity> getRelatedEntities() {
- **Add transient properties** with `@JsonIgnore` for UI access to related data
- **Use service layer** to populate transient relationships before rendering
- **Never return repository results directly** to templates without relationship population

@@ -285,3 +285,4 @@ For organizations with multiple repositories:
> Ensure all tests pass and CI/CD workflows are updated.

---

@@ -300,3 +300,4 @@ First Class Collections: a class that contains an array as an attribute should n
- [Object Calisthenics - Original 9 Rules by Jeff Bay](https://www.cs.helsinki.fi/u/luontola/tdd-2009/ext/ObjectCalisthenics.pdf)
- [ThoughtWorks - Object Calisthenics](https://www.thoughtworks.com/insights/blog/object-calisthenics)
- [Clean Code: A Handbook of Agile Software Craftsmanship - Robert C. Martin](https://www.oreilly.com/library/view/clean-code-a/9780136083238/)

@@ -18,6 +18,7 @@ You are an AI pair programming with a USER. Your goal is to help the USER create
🔴 **CRITICAL**: You MUST limit the number of questions you ask at any given time; try to limit it to one question, or AT MOST three related questions.

🔴 **MASSIVE SCALE WARNING**: When users mention extremely high write volumes (>10k writes/sec), batch processing of several million records in a short period of time, or "massive scale" requirements, IMMEDIATELY ask about:

1. **Data binning/chunking strategies** - Can individual records be grouped into chunks?
2. **Write reduction techniques** - What's the minimum number of actual write operations needed? Do all writes need to be processed individually, or can they be batched?
3. **Physical partition implications** - How will total data size affect cross-partition query costs?

@@ -143,16 +144,19 @@ For each pair of related containers, ask:
When entities have 30-70% access correlation, choose between:

**Multi-Document Container (Same Container, Different Document Types):**

- ✅ Use when: Frequent joint queries, related entities, acceptable operational coupling
- ✅ Benefits: Single query retrieval, reduced latency, cost savings, transactional consistency
- ❌ Drawbacks: Shared throughput, operational coupling, complex indexing

**Separate Containers:**

- ✅ Use when: Independent scaling needs, different operational requirements
- ✅ Benefits: Clean separation, independent throughput, specialized optimization
- ❌ Drawbacks: Cross-partition queries, higher latency, increased cost

**Enhanced Decision Criteria:**

- **>70% correlation + bounded size + related operations** → Multi-Document Container
- **50-70% correlation** → Analyze operational coupling:
  - Same backup/restore needs? → Multi-Document Container
@@ -216,10 +220,12 @@ A JSON representation showing 5-10 representative documents for the container
- **Consistency Level**: [Session/Eventual/Strong - with justification]

### Indexing Strategy

- **Indexing Policy**: [Automatic/Manual - with justification]
- **Included Paths**: [specific paths that need indexing for query performance]
- **Excluded Paths**: [paths excluded to reduce RU consumption and storage]
- **Composite Indexes**: [multi-property indexes for ORDER BY and complex filters]

```json
{
  "compositeIndexes": [
@@ -230,10 +236,12 @@ A JSON representation showing 5-10 representative documents for the container
  ]
}
```

- **Access Patterns Served**: [Pattern #2, #5 - specific pattern references]
- **RU Impact**: [expected RU consumption and optimization reasoning]

## Access Pattern Mapping

### Solved Patterns

🔴 CRITICAL: List both writes and reads solved.

@@ -246,6 +254,7 @@ A JSON representation showing 5-10 representative documents for the container
|---------|-----------|---------------|-------------------|---------------------|

## Hot Partition Analysis

- **MainContainer**: Pattern #1 at 500 RPS distributed across ~10K users = 0.05 RPS per partition ✅
- **Container-2**: Pattern #4 filtering by status could concentrate load on the "ACTIVE" status - **Mitigation**: Add a random suffix to the partition key (see the sketch below)

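A minimal sketch of that mitigation in JavaScript (the `status` value, bucket count, and key shape are illustrative assumptions, not part of the requirements above):

```javascript
// Spread hot "ACTIVE" documents across N logical buckets by suffixing the partition key.
const BUCKETS = 10;

// Write path: pick a random bucket so writes fan out across partitions.
function partitionKeyFor(status) {
  const suffix = Math.floor(Math.random() * BUCKETS); // 0..9
  return `${status}_${suffix}`; // e.g. "ACTIVE_3"
}

// Read path: queries must fan out over all bucket values for that status.
function bucketKeysFor(status) {
  return Array.from({ length: BUCKETS }, (_, i) => `${status}_${i}`);
}
```

The trade-off is that status-scoped reads become multi-partition queries, so the bucket count should stay small.
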
@@ -278,6 +287,7 @@ A JSON representation showing 5-10 representative documents for the container
- [ ] Trade-offs explicitly documented and justified ✅
- [ ] Global distribution strategy detailed ✅
- [ ] Cross-referenced against `cosmosdb_requirements.md` for accuracy ✅

```

## Communication Guidelines

@@ -594,11 +604,13 @@ When making aggregate design decisions:
Example cost analysis:

Option 1 - Denormalized Order+Customer:

- Read cost: 1000 RPS × 1 RU = 1000 RU/s
- Write cost: 50 order updates × 5 RU + 10 customer updates × 50 orders × 5 RU = 2750 RU/s
- Total: 3750 RU/s

Option 2 - Normalized with separate query:

- Read cost: 1000 RPS × (1 RU + 3 RU) = 4000 RU/s
- Write cost: 50 order updates × 5 RU + 10 customer updates × 5 RU = 300 RU/s
- Total: 4300 RU/s

@@ -620,6 +632,7 @@ When facing massive write volumes, **data binning/chunking** can reduce write op
**Result**: 90M records → 900k documents (95.7% reduction)

**Implementation**:

```json
{
  "id": "chunk_001",
@@ -635,17 +648,20 @@ When facing massive write volumes, **data binning/chunking** can reduce write op
```

**When to Use**:

- Write volumes >10k operations/sec
- Individual records are small (<2KB each)
- Records are often accessed in groups
- Batch processing scenarios

**Query Patterns** (see the sketch after this list):

- Single chunk: Point read (1 RU for 100 records)
- Multiple chunks: `SELECT * FROM c WHERE STARTSWITH(c.partitionKey, "account_test_")`
- RU efficiency: 43 RU per 150KB chunk vs 500 RU for 100 individual reads

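A hedged sketch of both sides of this pattern with the `@azure/cosmos` JavaScript SDK (the database/container names, the `records` array property, and the `account_test_` key prefix are assumptions carried over from the truncated example above):

```javascript
const { CosmosClient } = require("@azure/cosmos");

const client = new CosmosClient({ endpoint: process.env.COSMOS_ENDPOINT, key: process.env.COSMOS_KEY });
const container = client.database("telemetry").container("ResultChunks"); // assumed names

// Write path: group ~100 small records into one chunk document and issue a single upsert.
async function writeChunk(chunkId, partitionKey, records) {
  await container.items.upsert({ id: chunkId, partitionKey, records });
}

// Single chunk: point read of one document that carries the whole group.
async function readChunk(chunkId, partitionKey) {
  const { resource } = await container.item(chunkId, partitionKey).read();
  return resource ? resource.records : [];
}

// Multiple chunks: prefix query over the synthetic partition key.
async function readChunksByPrefix(prefix) {
  const { resources } = await container.items
    .query({
      query: "SELECT * FROM c WHERE STARTSWITH(c.partitionKey, @prefix)",
      parameters: [{ name: "@prefix", value: prefix }],
    })
    .fetchAll();
  return resources.flatMap(chunk => chunk.records);
}
```
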
**Cost Benefits**:

- 95%+ write RU reduction
- Massive reduction in physical operations
- Better partition distribution

@@ -656,6 +672,7 @@ When facing massive write volumes, **data binning/chunking** can reduce write op
When multiple entity types are frequently accessed together, group them in the same container using different document types:

**User + Recent Orders Example:**

```json
[
  {
@@ -676,23 +693,27 @@ When multiple entity types are frequently accessed together, group them in the s
```

**Query Patterns** (see the sketch after this list):

- Get user only: Point read with id="user_123", partitionKey="user_123"
- Get user + recent orders: `SELECT * FROM c WHERE c.partitionKey = "user_123"`
- Get specific order: Point read with id="order_456", partitionKey="user_123"

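A minimal sketch of the joint retrieval with the `@azure/cosmos` JavaScript SDK (the `type` discriminator field is an assumption; it is not visible in the truncated JSON above):

```javascript
// One single-partition query returns the user document plus its recent order documents.
async function getUserWithRecentOrders(container, userId) {
  const { resources } = await container.items
    .query({
      query: "SELECT * FROM c WHERE c.partitionKey = @pk",
      parameters: [{ name: "@pk", value: userId }],
    })
    .fetchAll();

  return {
    user: resources.find(doc => doc.type === "user"),      // assumed discriminator
    orders: resources.filter(doc => doc.type === "order"),
  };
}

// A specific order stays a cheap point read: id plus the shared partition key.
async function getOrder(container, orderId, userId) {
  const { resource } = await container.item(orderId, userId).read();
  return resource;
}
```
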
**When to Use:**

- 40-80% access correlation between entities
- Entities have natural parent-child relationship
- Acceptable operational coupling (throughput, indexing, change feed)
- Combined entity queries stay under reasonable RU costs

**Benefits:**

- Single query retrieval for related data
- Reduced latency and RU cost for joint access patterns
- Transactional consistency within partition
- Maintains entity normalization (no data duplication)

**Trade-offs:**

- Mixed entity types in change feed require filtering
- Shared container throughput affects all entity types
- Complex indexing policies for different document types

@@ -727,6 +748,7 @@ When cost analysis shows:
Example analysis:

Product + Reviews Aggregate Analysis:

- Access pattern: View product details (no reviews) - 70%
- Access pattern: View product with reviews - 30%
- Update frequency: Products daily, Reviews hourly
@@ -777,6 +799,7 @@ Example: ProductReview container
Composite partition keys are useful when data has a natural hierarchy and you need to query it at multiple levels. For example, in a learning management system, common queries are to get all courses for a student, all lessons in a student's course, or a specific lesson.

StudentCourseLessons container:

- Partition Key: student_id
- Document types with hierarchical IDs:

@@ -804,6 +827,7 @@ StudentCourseLessons container:
```

This enables:

- Get all data: `SELECT * FROM c WHERE c.partitionKey = "student_123"`
- Get course: `SELECT * FROM c WHERE c.partitionKey = "student_123" AND c.courseId = "course_456"`
- Get lesson: Point read with partitionKey="student_123" AND id="lesson_789"

@@ -813,6 +837,7 @@ This enables:
Composite partition keys are useful to model natural query boundaries.

TenantData container:

- Partition Key: tenant_id + "_" + customer_id

```json
@@ -831,12 +856,14 @@ Natural because queries are always tenant-scoped and users never query across te
Cosmos DB supports rich date/time operations in SQL queries. You can store temporal data using ISO 8601 strings or Unix timestamps. Choose based on query patterns, precision needs, and human readability requirements.

Use ISO 8601 strings for (see the sketch after this list):

- Human-readable timestamps
- Natural chronological sorting with ORDER BY
- Business applications where readability matters
- Built-in date functions like `DateTimePart` and `DateTimeDiff`

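A small sketch of an ISO 8601 query (JavaScript, `@azure/cosmos` SDK; the `createdAt` property and cutoff value are illustrative assumptions). Because ISO 8601 strings sort lexicographically, range filters and `ORDER BY` work directly on the stored strings:

```javascript
// Documents created since a given instant, newest first, with an age computed in the query.
async function recentDocuments(container, sinceIso) {
  const { resources } = await container.items
    .query({
      query: `
        SELECT c.id, c.createdAt,
               DateTimeDiff("day", c.createdAt, GetCurrentDateTime()) AS ageInDays
        FROM c
        WHERE c.createdAt >= @since
        ORDER BY c.createdAt DESC`,
      parameters: [{ name: "@since", value: sinceIso }],
    })
    .fetchAll();
  return resources;
}

// Example: recentDocuments(container, "2024-01-01T00:00:00Z");
```
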
Use numeric timestamps for:

- Compact storage
- Mathematical operations on time values
- High precision requirements
@@ -918,6 +945,7 @@ This pattern ensures uniqueness constraints while maintaining performance within
Hierarchical Partition Keys provide natural query boundaries using multiple fields as partition key levels, eliminating synthetic key complexity while optimizing query performance.

**Standard Partition Key**:

```json
{
  "partitionKey": "account_123_test_456_chunk_001" // Synthetic composite
@@ -925,6 +953,7 @@ Hierarchical Partition Keys provide natural query boundaries using multiple fiel
```

**Hierarchical Partition Key**:

```json
{
  "partitionKey": {
@@ -936,17 +965,20 @@ Hierarchical Partition Keys provide natural query boundaries using multiple fiel
```

**Query Benefits**:

- Single partition queries: `WHERE accountId = "123" AND testId = "456"`
- Prefix queries: `WHERE accountId = "123"` (efficient cross-partition)
- Natural hierarchy eliminates synthetic key logic

**When to Consider HPK**:

- Data has a natural hierarchy (tenant → user → document)
- Frequent prefix-based queries
- Desire to eliminate synthetic partition key complexity
- Applies only to the Cosmos DB NoSQL API

**Trade-offs**:

- Requires dedicated tier (not available on serverless)
- Newer feature with less production history
- Query patterns must align with hierarchy levels

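A hedged sketch of declaring such a container with the `@azure/cosmos` JavaScript SDK (the container id and the `/accountId`, `/testId` paths are assumptions taken from the example above):

```javascript
const { PartitionKeyDefinitionVersion, PartitionKeyKind } = require("@azure/cosmos");

async function createHierarchicalContainer(database) {
  // Two-level (multi-hash) partition key: accountId first, then testId.
  const { container } = await database.containers.createIfNotExists({
    id: "TestResults",
    partitionKey: {
      paths: ["/accountId", "/testId"],
      version: PartitionKeyDefinitionVersion.V2,
      kind: PartitionKeyKind.MultiHash,
    },
  });
  return container;
}

// Reads and writes then pass the full key as an array, e.g.:
// container.item("result_001", ["123", "456"]).read();
```
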
@@ -1006,10 +1038,12 @@ Example: Order Processing System
• Update pattern: Individual item status updates (100 RPS)

Option 1 - Combined aggregate (single document):

- Read cost: 1000 RPS × 1 RU = 1000 RU/s
- Write cost: 100 RPS × 10 RU (rewrite entire order) = 1000 RU/s

Option 2 - Separate items (multi-document):

- Read cost: 1000 RPS × 5 RU (query multiple items) = 5000 RU/s
- Write cost: 100 RPS × 10 RU (update single item) = 1000 RU/s

@@ -1036,6 +1070,7 @@ Example: Session tokens with 24-hour expiration
```

Container-level TTL configuration:

```json
{
  "defaultTtl": -1, // Enable TTL, no default expiration
scripts/indent-nested-md-code.js (new file, 156 lines)
@@ -0,0 +1,156 @@
#!/usr/bin/env node
/**
 * Indent nested Markdown code fences (``` ... ```) that appear inside other fenced code blocks
 * to ensure proper rendering on GitHub. Only modifies .md/.prompt.md/.instructions.md files
 * under the specified folders (prompts/, instructions/, collections/).
 *
 * Strategy:
 * - Parse each file line-by-line
 * - Detect outer fenced code blocks (up to 3 leading spaces + backticks >= 3)
 * - Within an outer block, find any inner lines that also start with a fence marker (```...)
 *   that are not the true closing line of the outer block (same tick length and no language info),
 *   and treat them as the start of a nested block
 * - Indent the inner block from its opening fence line through its next fence line (closing)
 *   by prefixing each of those lines with four spaces
 * - Repeat for multiple nested "inner blocks" within the same outer block
 *
 * Notes:
 * - We only consider backtick fences (```). Tilde fences (~~~) are uncommon in this repo and not targeted
 * - We preserve existing content and whitespace beyond the added indentation for nested fences
 */

const fs = require('fs');
const path = require('path');

const ROOT = process.cwd();
const TARGET_DIRS = ['prompts', 'instructions', 'collections'];
const VALID_EXTS = new Set(['.md', '.prompt.md', '.instructions.md']);

function walk(dir) {
  const results = [];
  const stack = [dir];
  while (stack.length) {
    const current = stack.pop();
    let entries = [];
    try {
      entries = fs.readdirSync(current, { withFileTypes: true });
    } catch {
      continue;
    }
    for (const ent of entries) {
      const full = path.join(current, ent.name);
      if (ent.isDirectory()) {
        stack.push(full);
      } else if (ent.isFile()) {
        const ext = getEffectiveExt(ent.name);
        if (VALID_EXTS.has(ext)) {
          results.push(full);
        }
      }
    }
  }
  return results;
}

function getEffectiveExt(filename) {
  if (filename.endsWith('.prompt.md')) return '.prompt.md';
  if (filename.endsWith('.instructions.md')) return '.instructions.md';
  return path.extname(filename).toLowerCase();
}

// Regex helpers
const fenceLineRe = /^(?<indent> {0,3})(?<ticks>`{3,})(?<rest>.*)$/; // up to 3 spaces + ``` + anything

function processFile(filePath) {
  const original = fs.readFileSync(filePath, 'utf8');
  const lines = original.split(/\r?\n/);

  let inOuter = false;
  let outerIndent = '';
  let outerTicksLen = 0;
  let i = 0;
  let changed = false;

  while (i < lines.length) {
    const line = lines[i];
    const m = line.match(fenceLineRe);

    if (!inOuter) {
      // Look for start of an outer fence
      if (m) {
        inOuter = true;
        outerIndent = m.groups.indent || '';
        outerTicksLen = m.groups.ticks.length;
      }
      i++;
      continue;
    }

    // We're inside an outer fence
    if (m) {
      // Is this the true closing fence for the current outer block?
      const indentLen = (m.groups.indent || '').length;
      const ticksLen = m.groups.ticks.length;
      const restTrim = (m.groups.rest || '').trim();
      const isOuterCloser = indentLen <= outerIndent.length && ticksLen === outerTicksLen && restTrim === '';
      if (isOuterCloser) {
        // End of outer block
        inOuter = false;
        outerIndent = '';
        outerTicksLen = 0;
        i++;
        continue;
      }

      // Otherwise, treat as nested inner fence start; indent until the matching inner fence (inclusive)
      changed = true;
      const innerTicksLen = ticksLen;
      lines[i] = '    ' + lines[i];
      i++;
      // Indent lines until we find a fence line with the same tick length (closing the inner block)
      while (i < lines.length) {
        const innerLine = lines[i];
        const m2 = innerLine.match(fenceLineRe);
        lines[i] = '    ' + innerLine;
        i++;
        if (m2 && m2.groups.ticks.length === innerTicksLen) break; // we've indented the closing inner fence; stop
      }
      continue;
    }

    // Regular line inside outer block
    i++;
  }

  if (changed) {
    fs.writeFileSync(filePath, lines.join('\n'));
    return true;
  }
  return false;
}

function main() {
  const roots = TARGET_DIRS.map(d => path.join(ROOT, d));
  let files = [];
  for (const d of roots) {
    if (fs.existsSync(d) && fs.statSync(d).isDirectory()) {
      files = files.concat(walk(d));
    }
  }

  let modified = 0;
  for (const f of files) {
    try {
      if (processFile(f)) modified++;
    } catch (err) {
      // Log and continue
      console.error(`Error processing ${f}:`, err.message);
    }
  }

  console.log(`Processed ${files.length} files. Modified ${modified} file(s).`);
}

if (require.main === module) {
  main();
}